Runners in a stuck state after the Actions outage #3334

Open
4 tasks done
Scalahansolo opened this issue Jun 12, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@Scalahansolo

Checks

Controller Version

0.9.1

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

Cannot repro as this only happened due to the outage

Describe the bug

After the Actions outage yesterday, all of the runners in my runner group ended up in the following state. The GitHub UI says each runner has an active job, but that job is just one that failed due to the outage.

[Screenshot: CleanShot 2024-06-12 at 09 19 17@2x]

The logs of the actual runner seem fine, and it's just waiting to be assigned a job properly.

[Screenshot: CleanShot 2024-06-12 at 09 21 44@2x]

Describe the expected behavior

I would have expected these failed jobs not to be listed as "active" on my runners. I'm guessing that because these failed jobs are still marked as active by GitHub, new jobs are not being assigned to these runners.

Additional Context

N/A

Controller Logs

N/A

Runner Pod Logs

N/A
Scalahansolo added the bug label Jun 12, 2024
nikola-jokic transferred this issue from actions/actions-runner-controller Jun 13, 2024
@nikola-jokic
Member

Transferred the issue here since it is related to the runner itself, and not ARC.

@Scalahansolo
Author

A quick update: the only way I could get these runners healthy again was to track down all of the "Active Jobs" that were in the failed state (this took forever) and use the GitHub API to hard-delete those runs. Once I had deleted all of them, GitHub eventually started to see the runners as idle and began assigning new jobs.
