Avoid per-window recomputation in log search custom windows #1826 #1941
Merged
Member
Does this close #1826?
Collaborator
Author
Yes, but a few test cases are failing, so I’m checking on that.
This change introduces a batched aggregation path for custom log search windows to avoid per-window recomputation. Instead of scanning log data separately for each window, the implementation computes metrics for all requested windows in a single pass over the full time range.

For PostgreSQL, a single SQL query using generate_series and ordered-set aggregates computes per-window percentiles efficiently. For other databases, logs are fetched once per component/operation and bucketed into windows in Python.

Key changes:
- Add aggregate_all_components_batch() method to LogAggregator service
- Modify _aggregate_custom_windows() to collect all windows first and delegate to batch method
- Implement proper fallback to per-window aggregation if batch fails
- Include db.rollback() before fallback to handle PostgreSQL failed state
- Filter duration_ms IS NOT NULL in batch SQL JOIN for consistent error_count semantics with per-window path
- Filter component/operation_type IS NOT NULL and non-empty in both PostgreSQL and Python paths for consistency with per-window behavior
- Add warning log for large ranges in non-PostgreSQL fallback path
- Add warning when sparse window_starts detected (batch generates full range)
- Limit window list to 10000 entries, keeping most recent windows
- Preserve existing aggregation semantics and filters
- Add comprehensive tests for batch aggregation parity

Performance impact:
- Before: N windows × M component/operation pairs = N×M database queries
- After PostgreSQL: 1 SQL query for all windows and pairs
- After SQLite: M queries regardless of window count

Closes #1826

Signed-off-by: Mihai Criveti <[email protected]>
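For readers following the fallback and window-cap items in the commit message, here is a minimal sketch of how that control flow could look. It is not the code from this PR: `aggregate_with_fallback`, `batch_fn`, `per_window_fn`, and `MAX_WINDOWS` are hypothetical names, and `db` is assumed to be any session-like object with a `rollback()` method. Only the 10000 cap and the rollback-before-fallback step come from the commit message.

```python
import logging
from typing import Any, Callable, Iterable, List

logger = logging.getLogger(__name__)

# Cap mentioned in the commit message; the constant name is hypothetical.
MAX_WINDOWS = 10_000


def aggregate_with_fallback(
    db: Any,
    window_starts: Iterable[Any],
    batch_fn: Callable[[Any, List[Any]], None],
    per_window_fn: Callable[[Any, Any], None],
) -> str:
    """Try the batch path once; on failure, roll back and go window by window.

    Returns the name of the path that was used.
    """
    # Keep only the most recent windows when the list is too long.
    windows = sorted(window_starts)[-MAX_WINDOWS:]
    try:
        batch_fn(db, windows)
        return "batch"
    except Exception:
        logger.warning("Batch aggregation failed; falling back to per-window path")
        # A failed statement can leave a PostgreSQL transaction in an aborted
        # state, so reset the session before issuing further queries.
        db.rollback()
        for start in windows:
            per_window_fn(db, start)
        return "per-window"
```

Rolling back before the fallback matters because, once a statement fails inside a PostgreSQL transaction, later statements in that transaction are rejected until it is rolled back.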
1467af8 to 888048f
Member
Review and Fixes Applied
Rebased on main and addressed the following issues identified during code review:
High Priority Fixes
Medium Priority Fixes
Low Priority Fixes
Tests Added
All 34 tests pass.
crivetimihai
approved these changes
Jan 8, 2026
Member
crivetimihai
left a comment
Rebased, updated, tested
Signed-off-by: NAYANAR [email protected]
closes #1826
This change introduces a batched aggregation path for custom log search windows to avoid per-window recomputation. Instead of scanning log data separately for each window, the implementation computes metrics for all requested windows in a single pass over the full time range. For PostgreSQL, a single SQL query using generate_series and ordered-set aggregates computes per-window percentiles efficiently. For other databases, logs are fetched once per component/operation and bucketed into windows in Python. Existing aggregation semantics and filters are preserved, while significantly reducing CPU usage and database load for large custom ranges.
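The non-PostgreSQL path described above (fetch once, then bucket in Python) can be illustrated with a small sketch. This is an assumption-laden illustration rather than the PR's implementation: `bucket_durations` and `percentile` are hypothetical helpers, entries are assumed to be `(timestamp, duration_ms)` pairs for a single component/operation, and the percentile uses linear interpolation, which is intended to mirror PostgreSQL's `percentile_cont`.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Optional, Tuple


def percentile(sorted_values: List[float], fraction: float) -> float:
    """Linear-interpolation percentile over an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("percentile of empty list")
    rank = fraction * (len(sorted_values) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * weight


def bucket_durations(
    entries: Iterable[Tuple[datetime, Optional[float]]],
    range_start: datetime,
    window: timedelta,
) -> Dict[datetime, Dict[str, float]]:
    """Assign each entry to its window in one pass, then summarise per window."""
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, duration_ms in entries:
        # Skip rows without a duration, as the batch path filters them out too.
        if duration_ms is None or timestamp < range_start:
            continue
        index = (timestamp - range_start) // window  # integer window index
        buckets[range_start + index * window].append(duration_ms)

    summary: Dict[datetime, Dict[str, float]] = {}
    for start, values in buckets.items():
        values.sort()
        summary[start] = {
            "count": len(values),
            "p50": percentile(values, 0.50),
            "p95": percentile(values, 0.95),
        }
    return summary
```

With this shape, the data is read once and the per-window cost is just the sort inside each bucket, so the number of database queries no longer depends on the number of windows.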