Skip to content

Commit

Permalink
Add a limit to expunging annotations
Browse files Browse the repository at this point in the history
Fixes #8494.

I don't know if this has been causing us any problems or not but in
theory if a large number of annotations have been marked as deleted (for
example: maybe a user account that had a lot of annotations has been
deleted) the `bulk_delete()` method will try to expunge all of these
annotations from the DB at once which could cause a long-running
`DELETE` query to hold a lock on the `annotation` table for too long and
disrupt the DB service.

It seems a no-brainer to just add a fixed limit to this. The method runs
periodically so over time it will eventually expunge all deleted
annotations but there's no hurry, no need to try to do them all at once.
  • Loading branch information
seanh committed Feb 14, 2024
1 parent 718119f commit cfd86de
Showing 1 changed file with 24 additions and 8 deletions.
32 changes: 24 additions & 8 deletions h/services/annotation_delete.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
from datetime import datetime, timedelta

from sqlalchemy import delete, select

from h.events import AnnotationEvent
from h.models import Annotation
from h.services.annotation_write import AnnotationWriteService
Expand Down Expand Up @@ -37,14 +39,28 @@ def delete_annotations(self, annotations):

def bulk_delete(self):
"""Expunge annotations marked as deleted from the database."""
self.request.db.query(Annotation).filter_by(deleted=True).filter(
# Deletes all annotations flagged as deleted more than 10 minutes ago. This
# buffer period should ensure that this task doesn't delete annotations
# deleted just before the task runs, which haven't yet been processed by the
# streamer.
Annotation.updated
< datetime.utcnow() - timedelta(minutes=10)
).delete()
self.request.db.execute(
delete(Annotation).where(
Annotation.id.in_(
select(
select(Annotation.id)
.where(Annotation.deleted.is_(True))
# Wait ten minutes before expunging an annotation to
# give the streamer time to process the deletion.
.where(
Annotation.updated
< datetime.utcnow() - timedelta(minutes=10)
)
# Only expunge up to 1000 annotations at a time to
# avoid long-running DB queries. This method is called
# periodically so eventually all deleted annotations
# will get expunged.
.limit(1000)
.cte()
)
)
)
)


def annotation_delete_service_factory(_context, request):
Expand Down

0 comments on commit cfd86de

Please sign in to comment.