fix(statistics): use subquery instead of join to avoid cartesian product
Because of the way we filter in the statistics view, any added filter (that
affects reports) adds a "dimension" to the cartesian product, inflating the
total number of hours reported.

Instead of using JOIN, we do EXISTS(SUBQUERY) now, which should avoid this
issue. Might be a tiny bit slower, but let's try to make it correct first, then fast.
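
The effect described above can be reproduced outside Django. The following is a minimal sketch using the standard-library `sqlite3` module; the `task`, `report` and `label` tables and their data are hypothetical and are not the project's schema. It only illustrates the general mechanism: filtering through a JOIN on a multi-valued relation repeats each report row once per matching related row, while an EXISTS subquery leaves the row count unchanged.

```python
import sqlite3

# Hypothetical schema: one task with two reports (2h + 3h) and two labels.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE task   (id INTEGER PRIMARY KEY);
    CREATE TABLE report (id INTEGER PRIMARY KEY, task_id INTEGER, hours REAL);
    CREATE TABLE label  (id INTEGER PRIMARY KEY, task_id INTEGER, name TEXT);
    INSERT INTO task (id) VALUES (1);
    INSERT INTO report (task_id, hours) VALUES (1, 2.0), (1, 3.0);
    INSERT INTO label (task_id, name) VALUES (1, 'a'), (1, 'b');
    """
)

# Filtering via JOIN: every report row is paired with every matching label,
# so the hours are summed once per (report, label) combination.
JOIN_SQL = """
    SELECT SUM(report.hours)
    FROM task
    JOIN report ON report.task_id = task.id
    JOIN label ON label.task_id = task.id
    WHERE label.name IN ('a', 'b')
"""

# Filtering via EXISTS: the label table only decides whether the task
# qualifies; it contributes no extra rows to the aggregation.
EXISTS_SQL = """
    SELECT SUM(report.hours)
    FROM task
    JOIN report ON report.task_id = task.id
    WHERE EXISTS (
        SELECT 1 FROM label
        WHERE label.task_id = task.id AND label.name IN ('a', 'b')
    )
"""

join_total = conn.execute(JOIN_SQL).fetchone()[0]
exists_total = conn.execute(EXISTS_SQL).fetchone()[0]
print("JOIN total:  ", join_total)    # 10.0, each report counted twice
print("EXISTS total:", exists_total)  # 5.0, the actual sum of hours
```

The commit applies the same idea through Django's `Exists` and `OuterRef`, with a subquery on the same model correlated by `pk`; the SQL above is a simplified stand-in for that pattern, not the query Django generates.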
winged committed Dec 23, 2024
1 parent 181db3f commit 084550f
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions backend/timed/reports/views.py
@@ -8,7 +8,7 @@
 from zipfile import ZipFile
 
 from django.conf import settings
-from django.db.models import F, Q, QuerySet, Sum
+from django.db.models import F, Q, Exists, OuterRef, QuerySet, Sum
 from django.db.models.functions import ExtractMonth, ExtractYear
 from django.http import HttpResponse
 from django.utils.http import content_disposition_header
@@ -117,9 +117,14 @@ def filter(self, /, **kwargs):
         return new_qs
 
     def filter_base(self, *args, **kwargs):
+        filtered = (
+            self.model.objects.filter(*args, **kwargs)
+            .values("pk")
+            .filter(pk=OuterRef("pk"))
+        )
         return StatisticQueryset(
             model=self.model,
-            base_qs=self._base.filter(*args, **kwargs),
+            base_qs=self._base.filter(Exists(filtered)),
             catch_prefixes=self._catch_prefixes,
             agg_filters=self._agg_filters,
         )