Three issues fixed:
1. Line comments (-- ...) were stored as bare text in the tokenizer,
causing them to leak as executable SQL during generation. Now
normalized to block comment format (/* ... */) at scan time,
matching the storage format already used for block comments.
2. Identifier trailing_comments were discarded in
expect_identifier_or_alias_keyword_with_quoted(), which created
Identifier { trailing_comments: Vec::new() } instead of
preserving the token's trailing_comments.
3. Comments between ( and SELECT in subqueries were lost because
the LParen token's trailing_comments were discarded. Now captured
and spliced into the inner Select's leading_comments in both
parse_paren() and parse_table_expression().
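The normalization in (1) can be sketched as follows; the function name and signature are illustrative, not the crate's actual API:

```rust
// Sketch: normalize a line comment's text to block-comment form at scan
// time, so the token store only ever holds `/* ... */` comments and the
// generator can emit them verbatim without re-checking their kind.
// (Illustrative helper; not the crate's real tokenizer code.)
fn normalize_line_comment(raw: &str) -> String {
    // `raw` is the text after the leading `--`, e.g. " hint"
    format!("/* {} */", raw.trim())
}

fn main() {
    assert_eq!(normalize_line_comment(" keep me"), "/* keep me */");
}
```

Storing one canonical comment form means generation never emits a bare `-- ...` mid-statement, which is what previously let line comments swallow the rest of the generated line.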
Force-pushed from e0a46a5 to 9c186aa
Owner
Thanks, I'll have a look!
The LParen trailing_comments were dropped when parsing parenthesized subqueries in expressions (e.g. WHERE a >= (/* hint */ SELECT ...)). Splice them into the inner Select's leading_comments, matching the existing fix for FROM-clause and CTE subqueries.
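The splice can be sketched like this, with simplified stand-in types (the real crate's `Select` and parser state differ):

```rust
// Sketch: splice the comments trailing a `(` token into the inner
// subquery's leading comments, so `(/* hint */ SELECT ...)` round-trips
// instead of dropping the hint. Types are illustrative, not the crate's.
struct Select {
    leading_comments: Vec<String>,
}

fn splice_lparen_comments(lparen_trailing: Vec<String>, mut inner: Select) -> Select {
    // Put the `(` token's trailing comments first, then whatever the
    // inner SELECT had already collected as leading comments.
    let mut comments = lparen_trailing;
    comments.append(&mut inner.leading_comments);
    inner.leading_comments = comments;
    inner
}

fn main() {
    let inner = Select { leading_comments: vec!["/* inner */".into()] };
    let spliced = splice_lparen_comments(vec!["/* hint */".into()], inner);
    assert_eq!(spliced.leading_comments, vec!["/* hint */", "/* inner */"]);
}
```

Prepending (rather than appending) keeps source order: the hint physically precedes the SELECT keyword, so it should precede any comments attached to the SELECT itself.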
transform_recursive silently skipped Merge nodes via the catch-all, meaning transforms (and any AST walks using transform_map) never descended into MERGE ... ON / USING / WHEN clauses.
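The catch-all pitfall looks roughly like this (a minimal sketch with made-up node types, not the crate's real AST):

```rust
// Sketch of the catch-all pitfall: a walk whose `match` ends in `_ => ...`
// silently skips children of any variant without an explicit arm, so a
// newer node kind like Merge is never descended into until an arm is added.
enum Node {
    Select(Vec<Node>),
    Merge { on: Box<Node>, when: Vec<Node> },
    Ident(String),
}

fn count_idents(node: &Node) -> usize {
    match node {
        Node::Ident(_) => 1,
        Node::Select(children) => children.iter().map(count_idents).sum(),
        // Without this arm, a `_ => 0` catch-all would hide everything
        // under MERGE ... ON / USING / WHEN from the walk.
        Node::Merge { on, when } => {
            count_idents(on) + when.iter().map(count_idents).sum::<usize>()
        }
    }
}

fn main() {
    let merge = Node::Merge {
        on: Box::new(Node::Ident("t.id".into())),
        when: vec![Node::Ident("s.id".into())],
    };
    assert_eq!(count_idents(&merge), 2);
}
```

Exhaustive matching (no `_` arm) makes the compiler flag any future variant that the walk forgets to descend into, which is the cheapest guard against this class of bug.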
Owner
Please run all tests again locally; the CI reports a failing test:

failures:
---- tokens::tests::test_comments stdout ----
thread 'tokens::tests::test_comments' (3081) panicked at crates/polyglot-sql/src/tokens.rs:2765:9:
assertion `left == right` failed
left: "/* comment */"
right: "comment"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
tokens::tests::test_comments
test result: FAILED. 719 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.32s
error: test failed, to rerun pass `-p polyglot-sql --lib`
Owner
Please retest with the current version.
There were a few small bugs causing comments to go missing, and one that corrupted the generated SQL (a trailing subquery comment). I imagine more comments get lost elsewhere, which I also imagine is always unexpected behaviour; for now I've covered a handful of common comment locations, including ones that may carry hint directives à la Spark.