Use touch EXIT to stop CP2K at walltime end #51

Open
danieleongari opened this issue Oct 9, 2019 · 6 comments

Comments

@danieleongari
Collaborator

CP2K supports a soft kill, triggered by creating an empty file called EXIT in the running folder.
This could be implemented in the plugin to stop the code.

When using EXIT, CP2K stops at the current SCF step (returning a warning as if it did not converge) but still prints the standard program termination, which allows for smoother parsing of the output.
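A minimal sketch of the soft kill described above, assuming the script has direct filesystem access to the folder CP2K is running in. The helper name `request_soft_exit` is hypothetical and not part of CP2K or the plugin; it only does the equivalent of `touch EXIT`.

```python
from pathlib import Path


def request_soft_exit(workdir):
    """Create an empty EXIT file in `workdir` and return its path.

    Hypothetical helper: equivalent to running `touch EXIT` in the
    folder where CP2K is running.
    """
    exit_file = Path(workdir) / "EXIT"
    exit_file.touch()  # creates the file if missing, leaves it empty
    return exit_file
```

For a remote calculation the same file would have to be created through whatever transport reaches the remote working directory, which this sketch does not cover.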

@yakutovicha
Contributor

yakutovicha commented Oct 18, 2019

Another possibility would be to set the walltime parameter in the CP2K input slightly smaller than the one set in the batch file.

@yakutovicha
Contributor

I am not sure, though, if the input plugin can do something while the calculation is running. I am actually pretty sure it can't. @ltalirz do you think it is possible?

@yakutovicha
Contributor

Made an issue upstream: aiidateam/aiida-core#3868. Let's see how this evolves.

@sphuber

sphuber commented Mar 26, 2020

> Another possibility would be to set the walltime parameter in the CP2K input slightly smaller than the one set in the batch file.

If CP2K provides this functionality, I would definitely go this route. This is what I do in Quantum ESPRESSO as well, and it works relatively well. I implement this directly in the PwBaseWorkChain, where I always take the metadata.options.max_wallclock_seconds input and set a fraction of that in the input parameters of the PwCalculation. This way, all other workflows automatically inherit this behavior.
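As a rough illustration of this approach, the sketch below derives a code-side walltime as a fraction of the scheduler walltime and writes it into a nested parameters dictionary. The function names, the default fraction of 0.95, and the `GLOBAL` / `WALLTIME` layout of the dictionary are assumptions made for the example, not something confirmed in this thread.

```python
def reduced_walltime(scheduler_walltime_seconds, fraction=0.95):
    """Return a walltime (in whole seconds) slightly below the scheduler limit."""
    if scheduler_walltime_seconds <= 0:
        raise ValueError("walltime must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(scheduler_walltime_seconds * fraction)


def with_walltime(parameters, scheduler_walltime_seconds, fraction=0.95):
    """Return a copy of `parameters` with the reduced walltime set.

    Assumption: the input is a nested dict with a GLOBAL section that
    accepts a WALLTIME entry in seconds.
    """
    updated = dict(parameters)
    global_section = dict(updated.get("GLOBAL", {}))
    global_section["WALLTIME"] = reduced_walltime(
        scheduler_walltime_seconds, fraction
    )
    updated["GLOBAL"] = global_section
    return updated
```

Doing this in one place in a base workflow, rather than in every calculation setup, is what lets the other workflows inherit the behavior.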

I responded to the issue Sasha opened. Although the other possibility could in principle be implemented, I would say it is quite challenging.

@dev-zero
Contributor

dev-zero commented Apr 8, 2021

Would this also mean that if I'm going to kill a process the base workchain could possibly react to it by first trying to write an EXIT file to terminate the process on remote gracefully?

@sphuber

sphuber commented Apr 8, 2021

> Would this also mean that if I'm going to kill a process the base workchain could possibly react to it by first trying to write an EXIT file to terminate the process on remote gracefully?

Not really. The workchain execution is blocked until the child process (the CalcJob in this example) is terminated. That means the workchain cannot retake control and perform an action before the CalcJob has finished running. You could of course manually write an EXIT file in the working directory of the CalcJob which would cause the code to stop gracefully. The daemon will then realize the job is done and start retrieval and parsing. If the parser properly recognizes the graceful shutdown and sets an appropriate exit code, and the base restart has a handler for that exit code to simply start a new calcjob, restarting from the output of the last, then that works. That is exactly what the PwBaseWorkChain and PwCalculation do.
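The restart logic described above can be sketched in plain Python, without any AiiDA classes. Everything here is hypothetical: the exit code values, the `run_once` callable (standing in for "run one calculation, parse it, return an exit code and its output"), and the `max_restarts` limit are placeholders for illustration, not the actual handler API.

```python
EXIT_OK = 0
EXIT_SOFT_STOP = 400  # hypothetical code for "stopped gracefully, unfinished"


def run_with_restarts(run_once, max_restarts=5):
    """Re-run a calculation until it finishes, restarting after soft stops.

    `run_once(restart_from)` must return `(exit_code, output)`, where
    `restart_from` is the output of the previous run or None.
    Returns `(output, number_of_restarts)`.
    """
    restart_from = None
    for attempt in range(max_restarts + 1):
        exit_code, output = run_once(restart_from)
        if exit_code == EXIT_OK:
            return output, attempt
        if exit_code != EXIT_SOFT_STOP:
            # Any other failure is not handled by this sketch.
            raise RuntimeError(f"unhandled exit code {exit_code}")
        # Graceful shutdown: continue from where the last run stopped.
        restart_from = output
    raise RuntimeError("maximum number of restarts exceeded")
```

The key point is that the loop only regains control after each run has terminated, which matches the explanation that the workchain cannot act while the child process is still running.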
