@MakinoharaShoko
Member

From #555

Summary

This PR implements a new WebGAL parser based on a parsing expression grammar (PEG). Compared with the current non-standard, string-based parsing, a grammar-based parser makes room for optimizations, more advanced grammar, better error handling, and similar improvements. The new parser aims to be fully compatible with the old one.

Features (If Implemented Correctly)

  • 100% backward compatibility
  • More intelligent parsing behaviors
  • Error reporting and recovery for command strings with syntax errors

Backward compatibility

The new parser also generates a sentenceList with all fields available in the old parser. It may add some extra fields (e.g., recording errors and some utility command string information), but the result should work on the current engine.

New Behavior for Syntax Errors

Now, if the parser encounters a syntax error while parsing a command, it stops at the character where the error occurs and skips to the next line. The part of the command that was parsed before the error still takes effect.

For example, if a script contains

changeFigure:stand.png -left -next;
pixiInit:; // this line has syntax error. `pixiInit` should not have ':'
setAnimation:enter-from-left -target=fig-left -next;

it will then be parsed as

changeFigure:stand.png -left -next;
pixiInit
setAnimation:enter-from-left -target=fig-left -next;

which preserves the behavior of pixiInit.
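
The recovery behavior described above can be illustrated with a minimal TypeScript sketch. This is not the PEG parser from this PR: it is a hand-written toy that only shows the "record the error, keep what was parsed, skip to the next line" idea. The type names, the `NO_CONTENT_COMMANDS` set, and the error shape are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real parser's field names may differ.
interface ParseError { line: number; column: number; message: string }
interface ParsedCommand { command: string; content: string; args: string[] }
interface ParseResult { sentences: ParsedCommand[]; errors: ParseError[] }

// Assumed for illustration: commands that must not be followed by ':'.
const NO_CONTENT_COMMANDS = new Set<string>(["pixiInit"]);

function parseWithRecovery(script: string): ParseResult {
  const sentences: ParsedCommand[] = [];
  const errors: ParseError[] = [];

  script.split("\n").forEach((rawLine, index) => {
    // Everything after the first ';' (for example a comment) is ignored.
    const semicolon = rawLine.indexOf(";");
    const body = semicolon === -1 ? rawLine : rawLine.slice(0, semicolon);
    if (body.trim() === "") return;

    const nameMatch = /^\s*([A-Za-z_]\w*)/.exec(body);
    if (nameMatch === null) {
      errors.push({ line: index + 1, column: 1, message: "expected a command name" });
      return; // nothing usable on this line; continue with the next one
    }

    const command = nameMatch[1] ?? "";
    const sentence: ParsedCommand = { command, content: "", args: [] };
    sentences.push(sentence); // the command name is already valid, so keep it

    let rest = body.slice(nameMatch[0].length);
    if (rest.startsWith(":")) {
      if (NO_CONTENT_COMMANDS.has(command)) {
        // Record the error (1-based position) and skip the rest of the line.
        errors.push({
          line: index + 1,
          column: nameMatch[0].length + 1,
          message: `'${command}' does not take ':'`,
        });
        return;
      }
      rest = rest.slice(1);
    }

    // Arguments are introduced by whitespace followed by '-'.
    const parts = rest.split(/\s+-/);
    sentence.content = (parts[0] ?? "").trim();
    sentence.args = parts.slice(1).map((a) => a.trim());
  });

  return { sentences, errors };
}
```

Running this on the three-line script above yields three sentences, with `pixiInit` kept as a bare command, and a single recorded error pointing at the `:` on line 2.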

Error Reporting and Recovery

The new parser supports error reporting and recovery. All errors in the script are recorded in an errors field, which can be shown to the user once a Language Server Protocol (LSP) integration is built on top of it.

The new behavior on syntax errors described in the previous section is what makes error recovery possible.
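
As a rough idea of how recorded errors could be surfaced through LSP, the sketch below maps error records to objects shaped like LSP `Diagnostic` values. The `ParserError` shape (1-based `line` and `column`) and the `"webgal-parser"` source label are assumptions, not the actual format of the errors field. The `Diagnostic`, `Range`, and `Position` shapes follow the LSP specification, where positions are zero-based and severity `1` means "Error".

```typescript
// Hypothetical error record; the real errors field may use different names.
interface ParserError { line: number; column: number; message: string }

// Minimal local copies of the LSP structures used here.
interface Position { line: number; character: number }
interface Range { start: Position; end: Position }
interface Diagnostic { range: Range; severity: number; source: string; message: string }

const DIAGNOSTIC_SEVERITY_ERROR = 1; // DiagnosticSeverity.Error in LSP

function toDiagnostics(errors: ParserError[]): Diagnostic[] {
  return errors.map((e) => {
    // Convert 1-based parser positions to the zero-based positions LSP expects.
    const start: Position = { line: e.line - 1, character: e.column - 1 };
    // Highlight the single offending character.
    const end: Position = { line: start.line, character: start.character + 1 };
    return {
      range: { start, end },
      severity: DIAGNOSTIC_SEVERITY_ERROR,
      source: "webgal-parser", // assumed label
      message: e.message,
    };
  });
}
```

A language server would typically publish such a list for each script file after every parse, so editors can underline the offending character.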

Changes to Source Code

The source code of the old parser is moved to packages/parser_legacy. The package name is changed to webgal-parser-legacy.

The source code of the new parser is put in packages/parser. It is distributed under the MPL-2.0 license (license file attached).

What's Next

  • Minify the generated parser. Currently, the generated PEG parser is ~200KB, which may become an issue for web deployment. We can minify the compiled parser.js to ~100KB. Combined with gzip on the web server, this may further reduce the amount of data actually transferred.
  • Migrate post-processing logic. To ensure backward compatibility, the raw content of some parsed fields is preserved so that the post-processing still works. This results in unnecessary double parsing. We may need to migrate the post-processing logic to utilize the parsed fields directly.

@MakinoharaShoko, feel free to change the merge base branch :)

@MakinoharaShoko
Member Author

After review, the following issues exist in the new parser:

  1. Failure to use ADD_NEXT_ARG_LIST to automatically add the next parameter to statements that require it.
  2. Incorrect timing for resource preloading. It should occur after parsing.
  3. The SceneParser lacks sufficient type hinting. The parsing results should conform to the IScene type.
  4. The parsing results lack the necessary fields sceneName and sceneUrl.
  5. Failure to call assetsSetter, resulting in script files not being converted to the correct paths as expected.
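
Several of these points (1, 2, 4 and 5) concern steps that run after the raw parse. The sketch below shows one possible shape for such a post-parse step. It is an illustration under assumptions, not the old parser's actual logic: the `Sentence` and `Scene` interfaces are simplified stand-ins for the real types, `assetCommands` is a hypothetical list of commands whose content is a file name, and the callback signatures are simplified.

```typescript
// Simplified, hypothetical stand-ins for the engine's sentence and scene types.
interface Arg { key: string; value: string | boolean | number }
interface Sentence { command: string; content: string; args: Arg[] }
interface Scene {
  sceneName: string;
  sceneUrl: string;
  sentenceList: Sentence[];
  assetsList: string[];
}

interface FinalizeOptions {
  sceneName: string;
  sceneUrl: string;
  addNextArgList: string[]; // stands in for ADD_NEXT_ARG_LIST
  assetCommands: string[];  // hypothetical: commands whose content is a file name
  assetSetter: (fileName: string, command: string) => string;
  assetsPrefetcher: (assets: string[]) => void;
}

function finalizeScene(parsed: Sentence[], options: FinalizeOptions): Scene {
  const assetsList: string[] = [];

  const sentenceList = parsed.map((sentence) => {
    const args = [...sentence.args];
    // Point 1: add the `next` argument for statements that require it.
    if (options.addNextArgList.includes(sentence.command) && !args.some((a) => a.key === "next")) {
      args.push({ key: "next", value: true });
    }
    // Point 5: convert file names to full paths through the asset setter.
    let content = sentence.content;
    if (options.assetCommands.includes(sentence.command) && content !== "") {
      content = options.assetSetter(content, sentence.command);
      assetsList.push(content);
    }
    return { ...sentence, content, args };
  });

  // Point 4: the result carries sceneName and sceneUrl.
  const scene: Scene = {
    sceneName: options.sceneName,
    sceneUrl: options.sceneUrl,
    sentenceList,
    assetsList,
  };

  // Point 2: preloading happens only after the whole scene has been parsed.
  options.assetsPrefetcher(assetsList);
  return scene;
}
```

Declaring the return type as a scene interface also addresses point 3 in spirit: the compiler then checks that the result has every required field.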

I have reverted the default exported SceneParser class in index.ts back to the old parser. I believe the following steps can be taken to align the new parser with the previous logic:

  1. Write a longer scene that covers as many statements and syntax variations as possible, and compare the differences in the parsing results between the new and old parsers. For this use case, the goal should be to achieve complete parity between the parsing results of the new and old parsers.
  2. For all test cases, first switch to the old parser. Then check whether the failing test cases are caused by issues within the old parser itself, or by inconsistencies between the test cases' expected results and the correct parsing results of the old parser.
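
The comparison in step 1 could be automated with a small diff helper such as the sketch below. It walks the old parser's output and reports every path where the new parser's output is missing a field or holds a different value. Extra fields that exist only in the new output (such as recorded errors) are deliberately ignored, since the PR description allows them. The function name and the report format are assumptions, and the inputs are expected to be plain JSON-like data.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns one message per difference; an empty array means the new output
// contains everything the legacy output contains, with identical values.
function diffAgainstLegacy(legacy: Json, next: Json, path: string = "$"): string[] {
  const diffs: string[] = [];

  if (Array.isArray(legacy) && Array.isArray(next)) {
    if (legacy.length !== next.length) {
      diffs.push(`${path}.length: ${legacy.length} !== ${next.length}`);
    }
    const shared = Math.min(legacy.length, next.length);
    for (let i = 0; i < shared; i++) {
      diffs.push(...diffAgainstLegacy(legacy[i] as Json, next[i] as Json, `${path}[${i}]`));
    }
    return diffs;
  }

  if (isPlainObject(legacy) && isPlainObject(next)) {
    // Only keys of the legacy output are checked; extra keys in `next` are allowed.
    for (const key of Object.keys(legacy)) {
      if (!(key in next)) {
        diffs.push(`${path}.${key}: missing in new output`);
      } else {
        diffs.push(...diffAgainstLegacy(legacy[key] as Json, next[key] as Json, `${path}.${key}`));
      }
    }
    return diffs;
  }

  // Primitives, or values of different kinds (e.g. array vs. object).
  if (legacy !== next) {
    diffs.push(`${path}: ${JSON.stringify(legacy)} !== ${JSON.stringify(next)}`);
  }
  return diffs;
}
```

Feeding both parsers the same long scene and asserting that the returned list is empty gives a single parity check that can run in the test suite.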

I believe these temporary imperfections are not difficult to resolve. Our primary goal is to ensure consistency between the parsing results of the new and old parsers. Once this is achieved, we can leverage the enhanced error detection capabilities of the new parser.
