gueniai
Collaborator

@gueniai gueniai commented Sep 11, 2025

What does this PR do?

Currently, if there's a parsing error (which usually means the output is unusable), we copy the input code into the output file. This leads to a confusing interaction: nothing indicates that the file is identical to the input, and the user is left wondering what happened.

This PR changes the output so that the transpiled code, including all the issues we found, is available in the "transpiled" file.

Tests

  • manually tested
  • added unit tests
  • added integration tests

github-actions bot commented Sep 11, 2025

✅ 27/27 passed, 1 flaky, 1m20s total

Flaky tests:

  • 🤪 test_transpile_sql_file (14.47s)

Running from acceptance #2340

@gueniai gueniai marked this pull request as ready for review September 11, 2025 19:13
@gueniai gueniai requested a review from a team as a code owner September 11, 2025 19:13
@vil1
Contributor

vil1 commented Sep 12, 2025

I don't think this is the proper way to go here.

When there is some parsing error, morpheus doesn't currently guarantee it won't "cut" some of its input.

For instance, when attempting to transpile

ADD SIGNATURE TO ProcForAlice BY CERTIFICATE csSelectT

which uses syntax that isn't currently covered by the grammar, morpheus will output the following

/* Parse error: unexpected extra input 'ADD' while parsing a sqlFile
expecting one of: @Local, End of batch, Identifier, Node ID, Select Statement, Statement, '(', ';', 'ARRAY', 'BULK', 'CALL', 'DBCC'...
'TO' was unexpected while parsing a sqlFile
expecting one of: End of batch, Select Statement, Statement, '(', 'BULK', 'CALL', 'COMMENT', 'CONTRACT', 'DBCC', 'DISABLE', 'ENABLE', 'GET'... */
-- FIXME: Unparsed input - ErrorNode encountered
;
/* SIGNATURE */
-- FIXME: SNOWFLAKE: The transpiler cannnot currently convert Execute body batch, but may be able to do so in the future
;

where most of the input query has disappeared (in this instance, we've lost everything but the SIGNATURE token).

What I suggest we do instead is to keep outputting context.source_code but we decorate it with comments listing the parsing errors encountered (the code for doing that should still be present somewhere).
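A minimal sketch of what that suggestion could look like, assuming a simple error record with a `kind` and a `message`. The `TranspileError` class, the `decorate_source_with_errors` helper, and this `ErrorKind` enum are hypothetical stand-ins for illustration, not the project's actual code:

```python
from dataclasses import dataclass
from enum import Enum, auto


class ErrorKind(Enum):
    # Hypothetical stand-in for the ErrorKind referenced in the PR diff.
    PARSING = auto()
    GENERATION = auto()


@dataclass
class TranspileError:
    # Hypothetical error record; the real class may carry more fields.
    kind: ErrorKind
    message: str


def decorate_source_with_errors(source_code, errors):
    """Return the untouched input, preceded by a SQL block comment
    that lists every parsing error."""
    parsing = [e for e in errors if e.kind == ErrorKind.PARSING]
    if not parsing:
        return source_code
    lines = ["/*", "  Parsing errors; the original input is kept unchanged below:"]
    for err in parsing:
        # Avoid closing the block comment early if a message contains "*/".
        safe = err.message.replace("*/", "* /")
        lines.append(f"  - {safe}")
    lines.append("*/")
    return "\n".join(lines) + "\n" + source_code
```

With this shape, none of the input is lost: the user gets their original query back, with the reasons it could not be parsed stated at the top of the file.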

@gueniai
Collaborator Author

gueniai commented Sep 12, 2025

@vil1 I do like your idea more as the long-term solution, although we need to think about how we would handle the output from the BB converter. I still think this initial change is a step in the right direction. Outputting the input source with 0 context is very disorienting.

Collaborator

@sundarshankar89 sundarshankar89 left a comment

LGTM. Maybe an additional test for this behaviour would show an example of how morpheus behaves on failures.

Contributor

@vil1 vil1 left a comment

Ready to approve once my little comment is addressed and https://github.com/databrickslabs/morpheus/pull/509 is merged


  if any(err.kind == ErrorKind.PARSING for err in error_list):
-     output_code = context.source_code or ""
+     output_code = context.transpiled_code or ""
Contributor

we should remove the whole if
