Inconsistencies in job's state #1204

Open
finalclass opened this issue Oct 12, 2018 · 2 comments · May be fixed by #1205

Comments

@finalclass

kue version: 0.11.6

I'm seeing a strange phenomenon with some of our jobs: there are jobs in the {q}:jobs:active ZSET whose state is set to failed.
I've tried to figure out how this is possible, but I couldn't. My first suspicion was that the process was restarted externally in the middle of the job.state() function, but MULTI is used there, so that shouldn't cause any inconsistencies.

The queue.checkActiveJobTtl() mechanism runs every second. In our case, certain events leave us with a lot of these inconsistent jobs, and they get processed every second, which puts unnecessary load on our servers.
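To make the inconsistency concrete, here is a minimal sketch of the check I have in mind. It is not kue code: findInconsistentJobs is a hypothetical helper, and it assumes the ZSET members have the form "<len>|<id>" and that the job states have already been read from the job hashes into a plain object.

```javascript
// Hypothetical helper, not part of kue: given the members of a state ZSET and
// a map of job id -> state (as read from the job hashes), list the mismatches.
function findInconsistentJobs(zsetMembers, stateById, expectedState) {
  const mismatches = [];
  for (const member of zsetMembers) {
    // Assumption: members look like "<len>|<id>"; keep what follows the first "|".
    const id = member.slice(member.indexOf('|') + 1);
    const actual = stateById[id];
    if (actual !== expectedState) {
      mismatches.push({ id, expected: expectedState, actual });
    }
  }
  return mismatches;
}
```

Feeding it the members of the active ZSET with expectedState set to 'active' would list the jobs that are in the ZSET but carry a different state.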

The simplest solution would be to add:

job._state = 'active';

here: https://github.com/Automattic/kue/blob/master/lib/kue.js#L245. However, on one server I've noticed the same kind of inconsistency with jobs in the "inactive" set (these are in the inactive ZSET but their state is set to "failed").

@finalclass
Author

I finally know what the problem is.

It's the refreshTtl function that puts these jobs back into the active list: https://github.com/Automattic/kue/blob/master/lib/queue/job.js#L346

This refreshTtl function is called when progress is set.
The thing is that we don't always wait for the progress update to complete before the job finishes.

So from time to time a job finishes, but the progress update (and thus refreshTtl) runs afterwards and adds the job back to the active ZSET.
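The ordering problem can be reproduced with a toy in-memory model. This is only a sketch of the race, not kue's implementation: createStore, setState, refreshTtlUnguarded and refreshTtlGuarded are all hypothetical names, and the sets stand in for the Redis ZSETs.

```javascript
// Toy in-memory model of the race; none of this is kue's actual code.
function createStore() {
  return { active: new Set(), failed: new Set(), state: {} };
}

// Moves a job between the state sets and updates its state field together.
function setState(store, id, next) {
  const prev = store.state[id];
  if (prev) store[prev].delete(id);
  store[next].add(id);
  store.state[id] = next;
}

// Models a TTL refresh that re-adds the job to the active set without
// looking at the job's current state.
function refreshTtlUnguarded(store, id) {
  store.active.add(id);
}

// Guarded variant: only touch the active set while the job is still active.
function refreshTtlGuarded(store, id) {
  if (store.state[id] === 'active') store.active.add(id);
}

const store = createStore();
setState(store, '1', 'active');
setState(store, '1', 'failed');   // the job finishes first
refreshTtlUnguarded(store, '1');  // the late progress update lands afterwards
// store.active now contains '1' although store.state['1'] is 'failed'
```

With refreshTtlGuarded in the last step the active set stays empty. Against real Redis the check and the write would have to happen atomically; ZADD with the XX option (update only members that already exist) might be one way to get that, but I haven't tried it with kue.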

@finalclass
Author

Unfortunately, refreshTtl and Job.prototype.progress do not accept callbacks, so it's impossible to fix this on our side.
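For illustration, this is roughly what we would do on our side if progress did take a completion callback. It is purely hypothetical: createProgressGate is a made-up helper, and the callback-accepting progress function it wraps does not exist in kue 0.11.6.

```javascript
// Hypothetical sketch: hold back "done" until every in-flight progress
// write has called back.
function createProgressGate(done) {
  let pending = 0;
  let finished = false;
  let called = false;

  function maybeDone() {
    if (finished && pending === 0 && !called) {
      called = true;
      done();
    }
  }

  return {
    // progressFn is ASSUMED to take a completion callback as its last
    // argument; kue's Job.prototype.progress does not do this today.
    track(progressFn, ...args) {
      pending += 1;
      progressFn(...args, () => {
        pending -= 1;
        maybeDone();
      });
    },
    // Call this where the worker would normally call done() directly.
    finish() {
      finished = true;
      maybeDone();
    },
  };
}
```

The worker would route every progress update through gate.track(...) and call gate.finish() at the end, so the job is only marked complete once no progress write is outstanding.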
