handle job log directory deleted for active task #6425
Comments
@oliver-sanders - do you have a recipe for reproducing this bug? From the description I tried:
[scheduling]
    cycling mode = integer
    [[graph]]
        R1 = task:started => housekeep
[runtime]
    [[task]]
        script = sleep 12
        platform = remote # Tried this in case it were necessary
    [[housekeep]]
        script = """
            RMTHIS=${CYLC_WORKFLOW_RUN_DIR}/log/job/1/task
            echo "Housekeeping ${RMTHIS}"
            rm -fr "${RMTHIS}"
        """
No, I don't. Your example seems along the right lines; you might want to try using a batch system rather than background, as the polling is different.
Finally have a replicable example (thank you @oliver-sanders):
[scheduling]
    cycling mode = integer
    [[graph]]
        R1 = task
[runtime]
    [[task]]
        script = """
            rm ${CYLC_WORKFLOW_RUN_DIR}/.service/contact
            rm -r "${CYLC_WORKFLOW_RUN_DIR}/log/job/${CYLC_TASK_CYCLE_POINT}/${CYLC_TASK_NAME}"
        """
        platform = _remote_pbs
Just to note from #6577:
Spotted in the wild!
If you delete the job log directory for an active task, Cylc will preserve its last known status indefinitely, i.e., Cylc will consider the job to be submitted/running forever.
In this case it was caused by a housekeep task being triggered whilst other tasks in the cycle were still running. The housekeep task tarred up the log/job/<cycle> dir, removing the job status files in the process.
This situation should be handled similarly to the job no longer appearing in the queue, i.e., the job is dead, long live the job. Stick it into the failed/submit-failed state as appropriate.
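To make the proposed behaviour concrete, here is a minimal Python sketch of the decision described above. It is not Cylc's actual code: the function name `resolve_status`, the `ACTIVE_TO_DEAD` mapping and the plain string statuses are all hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Hypothetical mapping from an active status to the state proposed in the
# issue for a job whose log directory has disappeared.
ACTIVE_TO_DEAD = {
    "submitted": "submit-failed",
    "running": "failed",
}


def resolve_status(last_known_status: str, job_log_dir: Path) -> str:
    """Return the status to record for a job after a poll.

    If the job log directory still exists, this check concludes nothing
    and the last known status is kept. If the directory is gone and the
    job was active, the job is treated as dead. Statuses that are not
    active (e.g. already finished) are left unchanged.
    """
    if job_log_dir.is_dir():
        return last_known_status
    return ACTIVE_TO_DEAD.get(last_known_status, last_known_status)
```

With this sketch, a job last seen as running whose log/job/<cycle>/<task> directory has been removed would be reported as failed, and one last seen as submitted would be reported as submit-failed, rather than keeping its old status forever.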