Skip to content

Conversation

@ronantakizawa
Copy link
Contributor

@ronantakizawa ronantakizawa commented Aug 24, 2025

Resolves issue #473.

This PR implements a configurable maximum parser recursion depth limit to prevent stack overflow from deeply
nested Jsonnet structures, following the approach from google/jsonnet#1230

Changes

  • Added maxParserRecursionDepth: Int = 1000 parameter to Settings class
  • Added --max-parser-recursion-depth CLI argument to allow runtime configuration
  • Updated Settings flow: CLI → Config → Settings → Interpreter → CachedResolver → Parser

Before:

  • Parsing extremely deeply nested structures (e.g., [[[...1000+ levels...]]]) could cause native stack
    overflow and crash the interpreter.

After:

  • Parser gracefully throws java.lang.Exception: Parsing exceeded maximum recursion depth of 1000 instead of
    crashing.
  • Users can now configure the limit: sjsonnet --max-parser-recursion-depth 2000 deeply_nested.jsonnet
  • Allows tuning for specific use cases while maintaining safe defaults

@He-Pin
Copy link
Contributor

He-Pin commented Aug 24, 2025

Better we can set this with the setting

@ronantakizawa
Copy link
Contributor Author

@He-Pin I implemented the fix, but let me know if I should remove some uses of checkParseDepth for performance purposes (Checking parse depth too much can slow down performance.)

@stephenamar-db stephenamar-db self-requested a review August 27, 2025 03:06
@He-Pin
Copy link
Contributor

He-Pin commented Aug 27, 2025

@ronantakizawa I think so , we should not slow the parsing much.

@ronantakizawa
Copy link
Contributor Author

ronantakizawa commented Aug 27, 2025

@He-Pin @stephenamar-db implemented feedback.
checkParseDepth is now only used for:

  1. arr method (line 201) - protects array parsing recursion
  2. expr method (line 265) - protects main expression parsing recursion
  3. climb method (line 270) - protects binary operator precedence climbing recursion
  4. objinside method (line 479) - protects object parsing recursion

Previously it was used for 27 other methods.

  1. compSuffix - comprehension suffix parsing
  2. arrBody - array body parsing
  3. assertExpr - assertion expression parsing
  4. function - function definition parsing
  5. ifElse - if-else expression parsing
  6. localExpr - local expression parsing
  7. expr1 - expression level 1 parsing
  8. exprSuffix2 - expression suffix parsing
  9. local - local binding parsing
  10. importStr - import string parsing
  11. importBin - import binary parsing
  12. import - import parsing
  13. error - error expression parsing
  14. importExpr - import expression parsing
  15. unaryOpExpr - unary operator parsing
  16. expr2 - expression level 2 parsing
  17. member - object member parsing
  18. field - object field parsing
  19. objlocal - object local parsing
  20. compspec - comprehension spec parsing
  21. forspec - for specification parsing
  22. ifspec - if specification parsing
  23. fieldname - field name parsing
  24. assertStmt - assertion statement parsing
  25. bind - binding parsing
  26. args - arguments parsing
  27. params - parameters parsing

@stephenamar-db
Copy link
Collaborator

Hi @lihaoyi, if I have time of course, I would appreciate if you could take a this PR. I'm trying to figure out if this is right way to implement maximum parse recursion depth with fastparse.

@lihaoyi
Copy link
Contributor

lihaoyi commented Aug 27, 2025

@stephenamar-db the implementation looks fine. FastParse "rules" are just methods, so passing around a currentDepth: Int is a totally reasonable thing to do. An alternative would be to put the data in the ParsingRun#misc dictionary, or just use a ThreadLocal, both of which would reduce the verbosity but trade off explicitness. Either approach works fine, so the current code works if you don't have a preference.

Only comment I would give is that I think a single checkParseDepth call inside def objinside is unnecessary, as any recursion would have to go through def expr anyway. But I guess it depends on how fine-grained you want checkParseDepth to be. In theory you should check every time you increment currentDepth + 1 if you want overshoots to be caught precisely, but if all you want to do is avoid infinite recursion then only checking in def expr` is enough

@He-Pin
Copy link
Contributor

He-Pin commented Aug 27, 2025

@ronantakizawa climb method (line 270) - protects binary operator precedence climbing recursion

This may not needed, I think the we should only limit where (), {}, [] nesting, not sure about the long binary operations

@ronantakizawa
Copy link
Contributor Author

@He-Pin @lihaoyi implemented fixes!

Now only arr and expr has checkParseDepth

@stephenamar-db stephenamar-db merged commit f684d27 into databricks:master Aug 28, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants