Workflow hit runahead on reload but can't figure out why #6517

Open
ColemanTom opened this issue Dec 5, 2024 · 2 comments
Labels
bug Something is wrong :( needs reproducing A bug report that does not yet have a reproducible example
Milestone

Comments

@ColemanTom
Contributor

Description

Sorry for the vagueness, I'm trying to figure out more information, but am juggling a lot. This is for multiple workflows.

  • They were running with CYLC_VERSION=8.3.5

  • I stopped them

  • I did a cylc vr with CYLC_VERSION=8.3.6 (and various small updates)

  • Log files are only showing WARNING and ERROR, no INFO (frustratingly I had the -q option on), so there isn't much information in there

  • Workflow has stalled, hitting a runahead limit, but I can't tell how or why, or which runahead limit was hit, because there is nothing in the logs about it. I don't think the workflow recognises that it is stalled. The task in the image below that has run is one I manually triggered to see if it would register.

    image

  • In the above image, the start_ops task has an xtrigger and no other prerequisites. The xtrigger was not satisfied, but could have been if it had actually run (I believe xtriggers don't run once the runahead limit is reached?)

  • If I make a job fail, I see the WARNING message printed, so logs are still updating

  • 3 cycles ran without issue, but the fourth has not started

  • There is no graph offset that goes more than one cycle (6 hourly cycling model)

    $ grep -Po '\[-P.*?\]' flow-processed.cylc | sort -u
    [-PT6H]
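
In the same spirit as the grep above, a minimal sketch for checking whether the processed config sets a runahead limit explicitly. The function name `find_runahead_limit` is made up for illustration, and the pattern assumes the setting is written as `runahead limit = ...` on its own line, as it would be under `[scheduling]`:

```shell
# Hypothetical helper: list explicit "runahead limit" settings in a
# processed workflow file, with line numbers.
# Assumption: the setting appears on its own line as "runahead limit = VALUE".
find_runahead_limit() {
    grep -nE '^[[:space:]]*runahead limit[[:space:]]*=' "$1"
}
```

Calling `find_runahead_limit flow-processed.cylc` prints each matching line. If nothing is printed (grep exits non-zero), no explicit limit was found, and the scheduler is presumably using its default.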
    

Reproducible Example

Haven't got one yet.

Expected Behaviour

The workflow shouldn't stall, or if it does, it should at least say so in the logs. Given it hasn't run anything for 12 hours, it has stalled.

@ColemanTom ColemanTom added the bug Something is wrong :( label Dec 5, 2024
@ColemanTom ColemanTom changed the title from "Logs only showing warning and errors?" to "Suite hit runahead on reload but can't figure out why" Dec 5, 2024
@MetRonnie
Member

  • Log files are only showing WARNING and ERROR, no INFO (frustratingly I had the -q option on), so there isn't much information in there

Just to quickly help with this part, you can change the log level for a running workflow using the "set-verbosity" command/option in the workflow menu in the GUI

@MetRonnie MetRonnie added the needs reproducing A bug report that does not yet have a reproducible example label Dec 6, 2024
@oliver-sanders oliver-sanders added this to the 8.x milestone Jan 8, 2025
@oliver-sanders
Member

Did the issue persist with reload/restart?

The output of the cylc dump command would be handy here. Alternatively, if you still have the workflow DB, the task_pool table contains this info.
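
For reference, a rough sketch of pulling the task_pool table out of the workflow DB. It assumes the DB is a SQLite file and that the `sqlite3` command-line tool is installed; the function name `show_task_pool` is made up for illustration. `SELECT *` is used so the sketch does not have to guess at column names:

```shell
# Hypothetical helper: print every row of the task_pool table from a
# workflow database file.
# Assumptions: the database is SQLite and the sqlite3 CLI is on PATH.
show_task_pool() {
    db_path="$1"
    if [ ! -f "$db_path" ]; then
        echo "no database at $db_path" >&2
        return 1
    fi
    # SELECT * avoids depending on exact column names, which may vary by version.
    sqlite3 -header -column "$db_path" 'SELECT * FROM task_pool;'
}
```

Usage would be something like `show_task_pool ~/cylc-run/<workflow-id>/log/db`, where the path is an assumption about where the scheduler keeps its database. If the workflow is still running, `cylc dump <workflow-id>` should give similar information without touching the DB.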

@MetRonnie MetRonnie changed the title from "Suite hit runahead on reload but can't figure out why" to "Workflow hit runahead on reload but can't figure out why" Jan 10, 2025

3 participants