-
Notifications
You must be signed in to change notification settings - Fork 319
feat: Optimize and fix filtering on toStartOfX primary key expressions #1265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: 0646139 The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
The latest updates on your projects. Learn more about Vercel for GitHub.
💡 Enable Vercel Agent with $100 free credit for automated AI reviews |
Pull Request ReviewThis PR introduces an optimization for filtering on ✅ Strengths
🔍 Code Quality ObservationsStrong Points:
Minor Suggestions:1. Error handling in The function logs an error but returns const columnArgument = args[0];
if (columnArgument == null) {
console.error(`Failed to parse column argument from ${expr}`);
return undefined; // Explicit return
} 2. Documentation in hook (packages/app/src/hooks/useOptimizedTimestampValueExpression.ts:63) The condition check on line 63 seems to exclude Metric sources from optimization. Consider adding a comment explaining why: // Metrics use a different schema structure that doesn't benefit from this optimization
if (source.kind === SourceKind.Metric)
return source.timestampValueExpression; 3. Test naming (packages/app/src/hooks/tests/useOptimizedTimestampValueExpression.test.tsx:120) The test description says "ungracefully" but it actually handles the case gracefully by returning it('should return undefined when source is null', () => { 4. Regex pattern robustness (packages/common-utils/src/renderChartConfig.ts:487) The regex 🎯 Performance ConsiderationsPositive:
Consider:
🔒 Security ConcernsNo major security issues identified. The code:
📊 Test CoverageExcellent coverage:
One suggestion: 📝 Additional Observations
🎓 Code Style Alignment with CLAUDE.md✅ Follows TypeScript strict typing guidelines 💡 RecommendationLGTM with minor suggestions. This is a well-implemented optimization with excellent test coverage. The minor suggestions above are optional improvements that don't block merging. The PR successfully addresses the performance issue while maintaining code quality and backward compatibility. Great work! 🚀 |
E2E Test Results✅ All tests passed • 25 passed • 3 skipped • 224s
|
2e4ca2e
to
1ca6d15
Compare
1ca6d15
to
39bc709
Compare
39bc709
to
36275d6
Compare
Pull Request Review - HyperDX #1265SummaryThis PR implements an optimization for ClickHouse queries with primary keys that use toStartOfX functions. The changes automatically detect and utilize these primary key expressions to improve query performance, and fix filtering issues with time ranges. Strengths1. Excellent Test Coverage
2. Well-Designed API
3. Performance Optimization
4. Backward Compatibility
Issues and Concerns1. Potential SQL Injection Risk (HIGH PRIORITY)Location: packages/common-utils/src/renderChartConfig.ts:560, 566 The code directly interpolates the function name and arguments into SQL. While toStartOf.formattedRemainingArgs comes from parsing timestamp expressions, it is directly injected into SQL without sanitization. Recommendation:
2. Error Handling Could Be More SpecificLocation: packages/common-utils/src/renderChartConfig.ts:521-523 The catch block uses console.log instead of a proper logger and silently swallows all errors. This makes debugging difficult. Recommendation: Use a proper logger with appropriate level and include context about the failure. 3. Regex Pattern Could Be More RobustLocation: packages/common-utils/src/utils.ts:598 The pattern (toStartOf\w+) matches any word characters, which could match invalid function names like toStartOfXYZ123. Recommendation: Use an explicit allowlist of valid ClickHouse toStartOf functions. 4. Type Safety IssueLocation: packages/common-utils/src/renderChartConfig.ts:511-515 No type guard or validation that primary_key exists and is a string before using it. Recommendation: Add proper type checking before destructuring. 5. Magic Strings in LogicLocation: packages/common-utils/src/utils.ts:645-649 Hard-coded function names like toUnixTimestamp and toDateTime make the code fragile and difficult to maintain. Recommendation: Extract to constants or a configuration object. Testing Recommendations
Security ReviewOverall Security Posture: MEDIUM RISK The main security concern is the direct string interpolation of parsed timestamp expressions into SQL queries. While the code uses the chSql template tag for parameterization in most places, the formattedRemainingArgs is directly inserted. Action Items:
ConclusionThis is a well-implemented feature with excellent test coverage that solves a real performance problem. The core logic is sound and the approach is pragmatic. However, there are security concerns around SQL injection that should be addressed before merging. The error handling and input validation could also be improved. Recommendation: Approve with required changes
The test coverage is exemplary and gives high confidence in the correctness of the implementation. |
36275d6
to
e983faa
Compare
e983faa
to
711c169
Compare
711c169
to
1e7f9bb
Compare
1e7f9bb
to
7fa8ec0
Compare
7fa8ec0
to
779f63c
Compare
} catch (e) { | ||
console.log('Failed to optimize timestampValueExpression', e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clicking through the app, I don't run into this error, and don't expect it to happen, but am adding this try catch here as an extra layer of protection against an error either in getTableMetadata or optimizeTimestampValueExpression, in which case we can always fall back to the raw timestampValueExpression
used before
779f63c
to
0646139
Compare
Code ReviewOverall: Well-structured PR with comprehensive test coverage. The implementation correctly optimizes timestamp filtering for ClickHouse primary keys. Critical Issues: None found Important Notes:
Optional Suggestions (non-blocking):
Recommendation: Approved - no blocking issues |
Closes HDX-2576
It is a common optimization to have a primary key like
toStartOfDay(Timestamp), ..., Timestamp
. This PR improves the experience when using such a primary key in the following ways:toStartOfDay(Timestamp)
andTimestamp
in this case, instead of justTimestamp
. This improves performance by better utilizing the primary index. Previously, this required a manual change to the source's Timestamp Column setting.toStartOfX
function to the right-hand-side of timestamp comparisons. So when filtering using an expression liketoStartOfDay(Timestamp)
, the generated SQL will have the conditiontoStartOfDay(Timestamp) >= toStartOfDay(<selected start time>) AND toStartOfDay(Timestamp) <= toStartOfDay(<selected end time>)
. This resolves an issue where some data would be incorrectly filtered out when filtering on such timestamp expressions (such as time ranges less than 1 minute).With this change, teams should no longer need to have multiple columns in their source timestamp column configuration. However, if they do, they will now have correct filtering.