Concurrent usage of dbm causes cache to go stale #104
David Gardner () wrote: Looked into this a bit more. Since my issue is with concurrent processes, not concurrent threads, I think all the MutexLock was providing for me was bypassing the default FileLock implementation. In my case anydbm picks dbhash, which has its own locking. I was also able to avoid the problem by simply setting rw_lockfile=False.
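For reference, a hypothetical sketch of that workaround, passing rw_lockfile=False to the DBM backend so dogpile's file-based read/write lock is skipped entirely (the filename and expiration time here are made up, not from the issue):

```python
from dogpile.cache import make_region

# Hypothetical configuration: disable dogpile's read/write lockfile and
# rely on the underlying DBM module's own locking (dbhash in this report).
region = make_region().configure(
    "dogpile.cache.dbm",
    expiration_time=60,  # one-minute expiration, as in the report
    arguments={
        "filename": "/tmp/dogpile-test.dbm",
        "rw_lockfile": False,  # skip the FileLock for reads/writes
    },
)
```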
Michael Bayer (zzzeek) wrote: Well, lockfiles can be weird. I can't reproduce any problem. I added this:
ran it in three windows, and I see:
Are you in one of the many danger areas for lockfiles, e.g. Windows, NFS shares, weird file systems, containers, etc.?
David Gardner () wrote: It's a typical Linux setup on an ext4 file system. I added your iteration counter and ran my test, launching 10 instances of the script at nearly the same time with:
Around iteration 50000 or so, the scripts start complaining about the age of the cache.
Michael Bayer (zzzeek) wrote: *shrugs*, lockfiles. Is there evidence that a lockfile is being held open permanently?
David Gardner () wrote: How would I check that?
Michael Bayer (zzzeek) wrote: Hmmm, you'd probably need to put more debugging into the lock code itself: set a timer when the file lock is acquired, somehow have it print out how long it's been held, then perhaps shut down the other processes and see whether that one keeps holding it open. Another way would be to use Linux commands; there's a utility, lslocks, you can try that shows who's holding the lock. Lockfiles are just like this; they have weird problems.
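The timer idea above can be sketched with the standard library alone. This is a hypothetical debugging aid, not dogpile.cache's FileLock: it wraps a plain POSIX flock and reports how long the lock was held, so an instance that never releases would stand out.

```python
import fcntl
import os
import tempfile
import time

class TimedFileLock:
    """Hypothetical debug wrapper: a POSIX flock that times how long
    each acquisition is held."""

    def __init__(self, path):
        # O_CREAT so the lockfile is created on first use
        self._fd = os.open(path, os.O_CREAT | os.O_RDWR)
        self._acquired_at = None

    def acquire(self):
        fcntl.flock(self._fd, fcntl.LOCK_EX)  # blocks until exclusive
        self._acquired_at = time.monotonic()

    def release(self):
        held = time.monotonic() - self._acquired_at
        fcntl.flock(self._fd, fcntl.LOCK_UN)
        return held  # seconds the lock was held

path = os.path.join(tempfile.gettempdir(), "dogpile-debug.lock")
lock = TimedFileLock(path)
lock.acquire()
time.sleep(0.05)  # simulate work done under the lock
held = lock.release()
print(f"lock held for {held:.3f}s")
```

Logging `held` on every release (or alarming when it exceeds the cache's expiration time) would show whether one process is sitting on the lock.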
Would need to revisit this and try running the test case again to reproduce.
Migrated issue, originally created by David Gardner ()
Ran into this issue in production, where users were reporting items in the cache being an hour old for a cache region configured with a one-minute expiration time. The problem was first observed with version 0.5.7 and reproduced with version 0.6.1.
I was able to reproduce the issue with a simple test of a function that returns datetime.now() (dogpile-test.py); if I run about 10 concurrent instances of the script, after a minute or so they start reporting stale data.
However, I noticed a work-around: if I use the MutexLock class from the lock_factory documentation, I don't run into the problem:
http://dogpilecache.readthedocs.io/en/latest/api.html?highlight=dogpile.cache.dbm#dogpile.cache.backends.file.DBMBackend.params.lock_factory
Attachments: dogpile-test.py