Check Goobi's working storage and clear it out #407
Comments
I'll move this back to TODO.
I've done some clearing out of Goobi's working storage this week. I removed ~10TB of ALTO files and JP2 images that I could match to content in the storage service (and so were definitely redundant). For more details, see the individual tickets (wellcomecollection/platform#5399, wellcomecollection/platform#5404). The next step in reducing costs is #423, which is quick and easy. If somebody wanted to clean this up further, here's where I'd start:
I did another pass of deletions last week, which has had a big effect on the size of the bucket:
This pass backfilled deletions that are already applied to new Goobi processes but hadn’t been applied to anything from 2021 and earlier:
I have another idea for a deletion pass, but it will take a while to pull down all the files needed to evaluate it.
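For anyone picking this up, the matching used in the first clear-out (working-storage files that also exist in the storage service) could be sketched roughly as below. This is only an illustration, not the script that was actually run: it assumes you already have listings of both stores as `key -> checksum` mappings, and that a file counts as redundant when its filename and checksum both match. The function name, keys, and checksums are all made up.

```python
import posixpath


def find_redundant_keys(working, stored):
    """Return working-storage keys that also exist in the storage service.

    Both arguments map an object key to its checksum. Matching on
    (filename, checksum) is an assumption made for this sketch.
    """
    stored_files = {(posixpath.basename(key), checksum) for key, checksum in stored.items()}
    return sorted(
        key
        for key, checksum in working.items()
        if (posixpath.basename(key), checksum) in stored_files
    )


# Hypothetical listings, for illustration only.
working = {
    "process/1/images/b1_0001.jp2": "aaa",
    "process/1/ocr/b1_0001.xml": "bbb",
    "process/2/images/b2_0001.jp2": "ccc",
}
stored = {
    "digitised/b1/objects/b1_0001.jp2": "aaa",
    "digitised/b1/alto/b1_0001.xml": "bbb",
    "digitised/b2/objects/b2_0001.jp2": "zzz",  # checksum differs, so not a match
}

redundant = find_redundant_keys(working, stored)
print(redundant)
```

Anything whose checksum doesn't match stays out of the deletion list, which keeps the pass on the safe side.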
Looking at our AWS bills, we may have far too much stored in Goobi's working storage.
I have suspected that, since the migration, our image cleanup steps may not have been working correctly, which is why I've labeled this a bug.
For example, searching "stepinwork:Image removal" gives me 1100+ items that have been in progress on this step for months.
Can we check and see what's in our working storage? How much stuff is in there?
Can we make sure the image removal steps are already working?
We'll also need to clear out anything that got past the DDS API call more than 90 days ago, since 90 days is how long we'd usually keep the working images.
Alex Chan is a good one to ask about this while I'm away.
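As a rough sketch of the 90-day rule above: given a list of processes and the time each one finished the DDS API call, pick out the ones whose working images are past the retention window. The record shape (`id`, `dds_call_finished`) is hypothetical and not taken from Goobi's actual data model.

```python
from datetime import datetime, timedelta, timezone

# The usual retention period for working images, per the issue text.
RETENTION = timedelta(days=90)


def eligible_for_cleanup(processes, now):
    """Return ids of processes that finished the DDS API call more than 90 days ago.

    Each process is a dict with hypothetical keys "id" and
    "dds_call_finished" (a timezone-aware datetime, or None if the
    step has not completed yet).
    """
    cutoff = now - RETENTION
    return [
        p["id"]
        for p in processes
        if p["dds_call_finished"] is not None and p["dds_call_finished"] < cutoff
    ]


# Made-up example data.
now = datetime(2022, 1, 1, tzinfo=timezone.utc)
processes = [
    {"id": 1, "dds_call_finished": datetime(2021, 6, 1, tzinfo=timezone.utc)},   # old enough
    {"id": 2, "dds_call_finished": datetime(2021, 12, 1, tzinfo=timezone.utc)},  # too recent
    {"id": 3, "dds_call_finished": None},                                        # not finished
]

to_clean = eligible_for_cleanup(processes, now)
print(to_clean)
```

Processes that never reached the DDS API call are left alone here; those would need a separate look.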