Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(queue): Add failed job backoff configuration #621

Merged
merged 2 commits into from
Jul 27, 2023

Conversation

deer-wmde
Copy link
Contributor

@deer-wmde deer-wmde commented Jul 20, 2023

Currently failed jobs will be retried 5 times[1]. This adds job backoff configuration as suggested here, so that failed retries will happen with incremental delays:

  • 10 seconds
  • 1 minute
  • 5 minutes
  • 15 minutes
  • 30 minutes

[1] https://github.com/wbstack/api/blob/main/start.sh#L20

Draft status because I couldn't make this work on my local cluster yet.

@m90
Copy link
Contributor

m90 commented Jul 24, 2023

When trying to test this, do you make jobs fail by calling $job->fail() or by throwing an exception? Not sure if it makes a difference, but maybe it's a first idea where to start debugging.

@deer-wmde
Copy link
Contributor Author

When trying to test this, do you make jobs fail by calling $job->fail() or by throwing an exception? Not sure if it makes a difference, but maybe it's a first idea where to start debugging.

I tried testing without changing code, I tried to create wikis while no mediawiki pods are around (so I guess this is equivalent to the exception throwing). Will try those two ways explicitly, I didn't know about the $job->fail method

@deer-wmde
Copy link
Contributor Author

deer-wmde commented Jul 24, 2023

When trying to test this, do you make jobs fail by calling $job->fail() or by throwing an exception? Not sure if it makes a difference, but maybe it's a first idea where to start debugging.

Turns out this makes ALL the difference. Calling $job->fail() (or rather $this->fail() inside a job) will tell Laravel to not retry that job.

While I'm now confident that this PR works as intended, this means that we have to refactor basically every Job class, that we want to respect the retry config (not just the one in this PR, but in general).

app/Jobs/Job.php Show resolved Hide resolved
config/queue.php Outdated
@@ -84,4 +84,5 @@
'table' => 'failed_jobs',
],

'backoff' => explode(',', env('QUEUE_BACKOFF', '10,60,300,900,1500')), // 10s, 1m, 5m, 15m, 30m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone would set QUEUE_BACKOFF to the empty string, would this use the default? I'm wondering if we should have a null default so that we can turn this behavior on and off by means of configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thinking, I removed the values

@deer-wmde deer-wmde merged commit 53a6b96 into main Jul 27, 2023
5 checks passed
@deer-wmde deer-wmde deleted the de/job-queue-backoff branch July 27, 2023 09:57
@deer-wmde deer-wmde mentioned this pull request Jul 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants