mateusz-klatt (Member) commented Sep 27, 2025

Summary

Fix the MongoDB grammar parser so that it properly handles collection names containing spaces and special characters, using the standard Pure grammar utilities.

Changes

  • Add collection name handling with PureGrammarComposerUtility.convertIdentifier() for automatic quoting
  • Use PureGrammarParserUtility.fromGrammarString() for proper unescaping of quoted collection names
  • Add round-trip test for collection names with spaces
  • Ensure consistency with other Pure grammar modules for identifier handling

What type of PR is this?

Bug Fix

What does this PR do / why is it needed?

Previously, MongoDB collection names with spaces were not parsed correctly: the surrounding quotes were not removed during parsing, so a collection name such as "Person With Spaces" was stored with the quotes included rather than as Person With Spaces.

This fix ensures that:

  1. Collection names are automatically quoted when they contain spaces or special characters (composer)
  2. Quoted collection names are properly unescaped during parsing (parser)
  3. Behavior is consistent with other Pure grammar modules
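
The quote-on-compose and unquote-on-parse round trip described above can be illustrated with a small standalone sketch. It does not call PureGrammarComposerUtility.convertIdentifier() or PureGrammarParserUtility.fromGrammarString(); the class IdentifierQuoting, its method names, the "plain identifier" pattern, and the choice of a single quote as the quote character are all assumptions made for illustration, and the real utilities may differ in their rules.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the behavior described in this PR.
// It does NOT use the Legend Engine utilities; all names and rules here are assumptions.
final class IdentifierQuoting
{
    // Assumed rule: names made only of letters, digits, '_' and '$' need no quoting.
    private static final Pattern PLAIN = Pattern.compile("[A-Za-z0-9_$]+");

    // Composer side: wrap the name in quotes only when it is not a plain identifier.
    static String quoteIfNeeded(String name)
    {
        if (PLAIN.matcher(name).matches())
        {
            return name;
        }
        // Escape backslashes first, then the quote character itself.
        return "'" + name.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Parser side: strip the surrounding quotes and undo the escaping.
    static String unquote(String token)
    {
        boolean quoted = token.length() >= 2
                && token.charAt(0) == '\''
                && token.charAt(token.length() - 1) == '\'';
        if (!quoted)
        {
            return token;
        }
        String inner = token.substring(1, token.length() - 1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < inner.length(); i++)
        {
            char c = inner.charAt(i);
            if (c == '\\' && i + 1 < inner.length())
            {
                // Skip the backslash and keep the escaped character.
                i++;
                c = inner.charAt(i);
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args)
    {
        String name = "Person With Spaces";
        String composed = quoteIfNeeded(name);
        String parsed = unquote(composed);
        // The parsed name should equal the original, without the quotes.
        System.out.println(composed + " -> " + parsed + " (round trip ok: " + name.equals(parsed) + ")");
    }
}
```

The bug described in this PR corresponds to skipping the unquote step: the stored name would then keep its surrounding quotes, and composing it again would not reproduce the original grammar text.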

Which issue(s) this PR fixes:

Fixes #4133

Other notes for reviewers:

The changes use existing Pure grammar utilities that are already used consistently across other modules in the Legend Engine codebase. All existing tests continue to pass, and a new test case validates the fix for collection names with spaces.

Does this PR introduce a user-facing change?

Yes - collection names with spaces and special characters will now be properly handled in MongoDB grammar files.

Test Results

All existing tests pass, as does the new test case testSingleCollectionMongoDBStoreGrammarWithSpaces(), which validates proper handling of collection names with spaces.

Files Changed

  • MongoDBGrammarComposer.java - Updated to use PureGrammarComposerUtility.convertIdentifier()
  • MongoDBGrammarParser.java - Updated to use PureGrammarParserUtility.fromGrammarString()
  • TestMongoDBGrammarRoundTrip.java - Added test case for collection names with spaces

mateusz-klatt requested a review from a team as a code owner (September 27, 2025 20:28)
Copilot AI review requested due to automatic review settings (September 27, 2025 20:28)
@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes the MongoDB grammar parser to properly handle collection names with spaces and special characters by using the standard Pure grammar utilities for identifier quoting and unquoting.

  • Replace custom collection name handling with PureGrammarComposerUtility.convertIdentifier() for automatic quoting during composition
  • Use PureGrammarParserUtility.fromGrammarString() for proper unescaping of quoted collection names during parsing
  • Add round-trip test case to validate that collection names with spaces work correctly

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File - Description

  • MongoDBSchemaComposer.java - Updated to use standard Pure grammar utility for identifier quoting
  • MongoDBSchemaParseTreeWalker.java - Updated to use standard Pure grammar utility for string unescaping
  • TestMongoDBGrammarRoundTrip.java - Added test case for collection names with spaces

github-actions bot commented Sep 27, 2025

Test Results

  • Files: 1 010 (+61)
  • Suites: 1 010 (+61)
  • Duration: 2h 51m 20s (+1h 24m 49s)
  • Tests: 13 330 (+2 537), of which 13 164 passed (+2 536), 166 skipped (+1), 0 failed (±0)
  • Runs: 25 804 (+10 070), of which 25 638 passed (+10 069), 166 skipped (+1), 0 failed (±0)

Results for commit 538f0e6. ± Comparison against base commit 4177560.

♻️ This comment has been updated with latest results.

… utilities

- Replace custom collection name handling with PureGrammarComposerUtility.convertIdentifier() for automatic quoting
- Use PureGrammarParserUtility.fromGrammarString() for proper unescaping of quoted collection names
- Add round-trip test for collection names with spaces
- Ensures consistency with other Pure grammar modules for identifier handling
Fixes finos#4133
Successfully merging this pull request may close these issues.

MongoDB Grammar: Collection names with spaces are not properly unquoted during parsing