-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Enable case-senstive identifer support for Prometheus connector #25946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Enable case-senstive identifer support for Prometheus connector #25946
Conversation
|
357620a
to
e175d9a
Compare
Reviewer's GuideIntroduce an optional case-sensitive metric name matching mode in the Prometheus connector by adding a new configuration property and updating metadata and client lookups to normalize or preserve identifiers accordingly, with extended test coverage for both modes. Sequence diagram for table lookup with case-sensitive name matchingsequenceDiagram
participant User
participant "PrometheusMetadata"
participant "PrometheusClient"
participant "PrometheusConnectorConfig"
User->>PrometheusMetadata: getTableHandle(session, tableName)
PrometheusMetadata->>PrometheusConnectorConfig: isCaseSensitiveNameMatching()
PrometheusMetadata->>PrometheusMetadata: normalizeIdentifier(session, tableName)
PrometheusMetadata->>PrometheusClient: getTable(schema, normalizedTableName)
PrometheusClient->>PrometheusConnectorConfig: isCaseSensitiveNameMatching()
PrometheusClient->>PrometheusClient: findActualTableName(tableNames, normalizedTableName)
PrometheusClient-->>PrometheusMetadata: PrometheusTable (actualTableName)
PrometheusMetadata-->>User: PrometheusTableHandle (actualTableName)
Class diagram for updated Prometheus connector configuration and metadata handlingclassDiagram
class PrometheusConnectorConfig {
+URI getPrometheusURI()
+boolean isCaseSensitiveNameMatching()
+PrometheusConnectorConfig setCaseSensitiveNameMatching(boolean)
-boolean caseSensitiveNameMatching
}
class PrometheusMetadata {
+String normalizeIdentifier(ConnectorSession, String)
-boolean caseSensitiveNameMatching
+PrometheusMetadata(PrometheusClient, PrometheusConnectorConfig)
}
class PrometheusClient {
+PrometheusTable getTable(String, String)
-String findActualTableName(List<String>, String)
}
PrometheusMetadata --> PrometheusClient
PrometheusMetadata --> PrometheusConnectorConfig
PrometheusClient --> PrometheusConnectorConfig
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- In case‐insensitive mode, if two metrics differ only by case (e.g., "HTTP_Requests" vs "http_requests"), findActualTableName will arbitrarily pick the first match—consider validating or erroring on ambiguous duplicates to avoid silent mis‐resolution.
- The TestPrometheusIntegrationMixedCase class is very large—consider extracting shared setup into helper methods or splitting it into smaller, focused test classes for better maintainability.
- There’s duplicate case‐normalization logic in PrometheusMetadata, PrometheusClient, and the mock client—consider centralizing this into a shared utility to prevent drift and simplify future updates.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In case‐insensitive mode, if two metrics differ only by case (e.g., "HTTP_Requests" vs "http_requests"), findActualTableName will arbitrarily pick the first match—consider validating or erroring on ambiguous duplicates to avoid silent mis‐resolution.
- The TestPrometheusIntegrationMixedCase class is very large—consider extracting shared setup into helper methods or splitting it into smaller, focused test classes for better maintainability.
- There’s duplicate case‐normalization logic in PrometheusMetadata, PrometheusClient, and the mock client—consider centralizing this into a shared utility to prevent drift and simplify future updates.
## Individual Comments
### Comment 1
<location> `presto-prometheus/src/main/java/com/facebook/presto/plugin/prometheus/PrometheusConnectorConfig.java:43` </location>
<code_context>
private String trustStorePath;
private String truststorePassword;
private boolean verifyHostName;
+ private boolean caseSensitiveNameMatching;
@NotNull
</code_context>
<issue_to_address>
Default value for caseSensitiveNameMatching is not set.
Explicitly initialize caseSensitiveNameMatching to false or add documentation to clarify its default value for maintainability.
</issue_to_address>
### Comment 2
<location> `presto-prometheus/src/test/java/com/facebook/presto/plugin/prometheus/TestPrometheusIntegrationMixedCase.java:196` </location>
<code_context>
+ mockClient.setFetchUriOverride("cpu_usage", mockClient.createEmptyResponse());
+ assertQueryFails(caseSensitiveSession, "SELECT * FROM \"cpu_usage\" WHERE value > 0", ".*Table prometheus_case_sensitive.default.cpu_usage does not exist");
+
+ // Test that uppercase "UP" fails in case-sensitive mode
+ // Make sure we don't have an uppercase "UP" metric in the table names
+ mockClient.removeMetric("UP");
+ mockClient.setTableOverride("UP", null);
+ mockClient.setFetchUriOverride("UP", mockClient.createEmptyResponse());
+ assertQueryFails(caseSensitiveSession, "SELECT * FROM \"UP\" WHERE value > 0", ".*Table prometheus_case_sensitive.default.UP does not exist");
+ }
+ @Test
</code_context>
<issue_to_address>
Consider adding a test for ambiguous metric names (e.g., both 'up' and 'UP' present) in case-insensitive mode.
Add a test with both 'up' and 'UP' in the mock client under case-insensitive mode to verify consistent and predictable metric resolution.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
+ // Test that uppercase "UP" fails in case-sensitive mode
+ // Make sure we don't have an uppercase "UP" metric in the table names
+ mockClient.removeMetric("UP");
+ mockClient.setTableOverride("UP", null);
+ mockClient.setFetchUriOverride("UP", mockClient.createEmptyResponse());
+ assertQueryFails(caseSensitiveSession, "SELECT * FROM \"UP\" WHERE value > 0", ".*Table prometheus_case_sensitive.default.UP does not exist");
+ }
+ @Test
=======
+ // Test that uppercase "UP" fails in case-sensitive mode
+ // Make sure we don't have an uppercase "UP" metric in the table names
+ mockClient.removeMetric("UP");
+ mockClient.setTableOverride("UP", null);
+ mockClient.setFetchUriOverride("UP", mockClient.createEmptyResponse());
+ assertQueryFails(caseSensitiveSession, "SELECT * FROM \"UP\" WHERE value > 0", ".*Table prometheus_case_sensitive.default.UP does not exist");
+
+ // Test ambiguous metric names in case-insensitive mode
+ mockClient.setCaseSensitiveNameMatching(false);
+ // Add both "up" and "UP" metrics
+ mockClient.addMetric("up", mockClient.createMetricResponse("up", 1.0));
+ mockClient.addMetric("UP", mockClient.createMetricResponse("UP", 2.0));
+ // Query for "up" in case-insensitive mode
+ MaterializedResult upResult = getQueryRunner().execute(getSession(), "SELECT value FROM \"up\"");
+ // Assert that the result is either 1.0 or 2.0, depending on resolution strategy
+ // (Assume the implementation picks the first or a deterministic one)
+ Set<Object> upValues = upResult.getOnlyColumnAsSet();
+ assertTrue(upValues.contains(1.0) || upValues.contains(2.0), "Expected value from either 'up' or 'UP' metric");
+ // Clean up
+ mockClient.removeMetric("up");
+ mockClient.removeMetric("UP");
+ mockClient.setCaseSensitiveNameMatching(true);
+ }
+ @Test
>>>>>>> REPLACE
</suggested_fix>
### Comment 3
<location> `presto-prometheus/src/test/java/com/facebook/presto/plugin/prometheus/TestPrometheusIntegrationMixedCase.java:249` </location>
<code_context>
+ // Test with http_requests_total - all case variations should return the same result
</code_context>
<issue_to_address>
Missing test for querying a non-existent metric in both case-sensitive and case-insensitive modes.
Add a test that queries a non-existent metric in both modes and asserts the correct error is thrown.
Suggested implementation:
```java
// Test with api_calls - both lowercase and uppercase should work
MaterializedResult lowerCaseApi = getQueryRunner().execute(getSession(), "SELECT value FROM \"api_calls\"");
// Test querying a non-existent metric in case-insensitive mode
try {
getQueryRunner().execute(getSession(), "SELECT value FROM \"non_existent_metric\"");
fail("Expected exception for non-existent metric in case-insensitive mode");
} catch (RuntimeException e) {
// Assert that the error message contains information about the missing metric
assertTrue(e.getMessage().toLowerCase().contains("non_existent_metric"));
}
try {
getQueryRunner().execute(getSession(), "SELECT value FROM \"NON_EXISTENT_METRIC\"");
fail("Expected exception for non-existent metric in case-insensitive mode (uppercase)");
} catch (RuntimeException e) {
assertTrue(e.getMessage().toLowerCase().contains("non_existent_metric"));
}
try {
getQueryRunner().execute(getSession(), "SELECT value FROM \"Non_Existent_Metric\"");
fail("Expected exception for non-existent metric in case-insensitive mode (mixed case)");
} catch (RuntimeException e) {
assertTrue(e.getMessage().toLowerCase().contains("non_existent_metric"));
}
```
If your test suite supports case-sensitive mode, you should also add similar tests for that mode.
You may need to adjust the exception type if your code throws a more specific exception for missing metrics.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@@ -40,6 +40,7 @@ public class PrometheusConnectorConfig | |||
private String trustStorePath; | |||
private String truststorePassword; | |||
private boolean verifyHostName; | |||
private boolean caseSensitiveNameMatching; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Default value for caseSensitiveNameMatching is not set.
Explicitly initialize caseSensitiveNameMatching to false or add documentation to clarify its default value for maintainability.
// Test that uppercase "UP" fails in case-sensitive mode | ||
// Make sure we don't have an uppercase "UP" metric in the table names | ||
mockClient.removeMetric("UP"); | ||
mockClient.setTableOverride("UP", null); | ||
mockClient.setFetchUriOverride("UP", mockClient.createEmptyResponse()); | ||
assertQueryFails(caseSensitiveSession, "SELECT * FROM \"UP\" WHERE value > 0", ".*Table prometheus_case_sensitive.default.UP does not exist"); | ||
} | ||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider adding a test for ambiguous metric names (e.g., both 'up' and 'UP' present) in case-insensitive mode.
Add a test with both 'up' and 'UP' in the mock client under case-insensitive mode to verify consistent and predictable metric resolution.
// Test that uppercase "UP" fails in case-sensitive mode | |
// Make sure we don't have an uppercase "UP" metric in the table names | |
mockClient.removeMetric("UP"); | |
mockClient.setTableOverride("UP", null); | |
mockClient.setFetchUriOverride("UP", mockClient.createEmptyResponse()); | |
assertQueryFails(caseSensitiveSession, "SELECT * FROM \"UP\" WHERE value > 0", ".*Table prometheus_case_sensitive.default.UP does not exist"); | |
} | |
@Test | |
+ // Test that uppercase "UP" fails in case-sensitive mode | |
+ // Make sure we don't have an uppercase "UP" metric in the table names | |
+ mockClient.removeMetric("UP"); | |
+ mockClient.setTableOverride("UP", null); | |
+ mockClient.setFetchUriOverride("UP", mockClient.createEmptyResponse()); | |
+ assertQueryFails(caseSensitiveSession, "SELECT * FROM \"UP\" WHERE value > 0", ".*Table prometheus_case_sensitive.default.UP does not exist"); | |
+ | |
+ // Test ambiguous metric names in case-insensitive mode | |
+ mockClient.setCaseSensitiveNameMatching(false); | |
+ // Add both "up" and "UP" metrics | |
+ mockClient.addMetric("up", mockClient.createMetricResponse("up", 1.0)); | |
+ mockClient.addMetric("UP", mockClient.createMetricResponse("UP", 2.0)); | |
+ // Query for "up" in case-insensitive mode | |
+ MaterializedResult upResult = getQueryRunner().execute(getSession(), "SELECT value FROM \"up\""); | |
+ // Assert that the result is either 1.0 or 2.0, depending on resolution strategy | |
+ // (Assume the implementation picks the first or a deterministic one) | |
+ Set<Object> upValues = upResult.getOnlyColumnAsSet(); | |
+ assertTrue(upValues.contains(1.0) || upValues.contains(2.0), "Expected value from either 'up' or 'UP' metric"); | |
+ // Clean up | |
+ mockClient.removeMetric("up"); | |
+ mockClient.removeMetric("UP"); | |
+ mockClient.setCaseSensitiveNameMatching(true); | |
+ } | |
+ @Test |
// Test with http_requests_total - all case variations should return the same result | ||
MaterializedResult lowerCaseHttp = getQueryRunner().execute(getSession(), "SELECT value FROM \"http_requests_total\""); | ||
MaterializedResult upperCaseHttp = getQueryRunner().execute(getSession(), "SELECT value FROM \"HTTP_REQUESTS_TOTAL\""); | ||
MaterializedResult mixedCaseHttp = getQueryRunner().execute(getSession(), "SELECT value FROM \"Http_Requests_Total\""); | ||
// All queries should return the same result | ||
assertEquals(lowerCaseHttp.getOnlyColumnAsSet().iterator().next(), 10.0); | ||
assertEquals(lowerCaseHttp.getOnlyColumnAsSet().iterator().next(), upperCaseHttp.getOnlyColumnAsSet().iterator().next()); | ||
assertEquals(lowerCaseHttp.getOnlyColumnAsSet().iterator().next(), mixedCaseHttp.getOnlyColumnAsSet().iterator().next()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Missing test for querying a non-existent metric in both case-sensitive and case-insensitive modes.
Add a test that queries a non-existent metric in both modes and asserts the correct error is thrown.
Suggested implementation:
// Test with api_calls - both lowercase and uppercase should work
MaterializedResult lowerCaseApi = getQueryRunner().execute(getSession(), "SELECT value FROM \"api_calls\"");
// Test querying a non-existent metric in case-insensitive mode
try {
getQueryRunner().execute(getSession(), "SELECT value FROM \"non_existent_metric\"");
fail("Expected exception for non-existent metric in case-insensitive mode");
} catch (RuntimeException e) {
// Assert that the error message contains information about the missing metric
assertTrue(e.getMessage().toLowerCase().contains("non_existent_metric"));
}
try {
getQueryRunner().execute(getSession(), "SELECT value FROM \"NON_EXISTENT_METRIC\"");
fail("Expected exception for non-existent metric in case-insensitive mode (uppercase)");
} catch (RuntimeException e) {
assertTrue(e.getMessage().toLowerCase().contains("non_existent_metric"));
}
try {
getQueryRunner().execute(getSession(), "SELECT value FROM \"Non_Existent_Metric\"");
fail("Expected exception for non-existent metric in case-insensitive mode (mixed case)");
} catch (RuntimeException e) {
assertTrue(e.getMessage().toLowerCase().contains("non_existent_metric"));
}
If your test suite supports case-sensitive mode, you should also add similar tests for that mode.
You may need to adjust the exception type if your code throws a more specific exception for missing metrics.
e175d9a
to
b39b13b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pulled branch, made local doc build to review. Thanks!
Description
Added catalog-level config to support
case-sensitive-name-matching
This PR implements case sensitivity handling for Prometheus metric (table) names:
case-sensitive-name-matching
configuration propertyWhen case-sensitive-name-matching is set to true:
HTTP_Requests
andhttp_requests
can exist as separate metricsWhen
case-sensitive-name-matching
is set to false (default for backward compatibility):Motivation and Context
Prometheus has a fundamentally different data model compared to traditional relational databases:
Table = Metric Name: In Prometheus, each table corresponds to a single metric name. These metric names come directly from the Prometheus API and can have mixed case (e.g., http_requests_total, HTTP_Requests_Total, etc.).
Fixed Column Schema: As seen in
PrometheusTable.java
, every Prometheus table has exactly the same three columns:These columns are always named "labels", "timestamp", and "value" - they're not user-defined and don't come from Prometheus.
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.