-
Notifications
You must be signed in to change notification settings - Fork 23
Open
Labels
Description
Describe the feature you'd like
It takes longer to execute GetExperimentGoalCount
method and GetExperimentEvaluationCount
method sometimes.
GetExperimentGoalCount (pkg/eventcounter/storage/v2/sql/goal_count.sql
)
Root Causes:
- Redundant DISTINCT operation: The CTE
grouped_by_user_evaluation
already groups by(user_id, variation_id)
, making each row unique. The outer query'sCOUNT(DISTINCT user_id)
is recalculating uniqueness unnecessarily - should useCOUNT(*)
instead - Unnecessary ORDER BY clause: The query contains
ORDER BY variation_id
which forces BigQuery to sort results before returning, but the application code already handles ordering based onreq.VariationIds
- this ORDER BY should be removed - Missing variation_id filter: The query aggregates ALL variations then filters in application code. Should add
WHERE variation_id IN UNNEST(@variationIDs)
to reduce data scanned - BigQuery shuffle overhead: The combination of DISTINCT + ORDER BY forces expensive data redistribution across nodes - removing both will eliminate shuffle operations
GetExperimentEvaluationCount (pkg/eventcounter/storage/v2/sql/evaluation_count.sql
)
Root Causes:
- Direct COUNT(DISTINCT) on raw data: The query runs
COUNT(DISTINCT user_id)
directly on the full dataset without pre-aggregation. Should use CTE withGROUP BY user_id, variation_id
first, thenCOUNT(*)
on grouped results - No staged aggregation: Single-pass query computes both
COUNT(DISTINCT user_id)
andCOUNT(id)
simultaneously on raw data. Should separate into CTE for pre-grouping, then aggregate the grouped data - Missing variation_id filter: Like goal_count.sql, processes all variations without filtering. Should add
WHERE variation_id IN UNNEST(@variationIDs)
to reduce scan scope - Inconsistent pattern with goal_count.sql: Uses different aggregation approach than goal_count.sql. Should align to use same CTE-based pattern for consistency and performance
Background
No response
Alternative solutions
No response