-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reportInfiniteRecursion pass is excessively slow #457
Comments
Did some more investigating. In my grammar I have rules like these: select_columns
= head:column_list_item tail:(__ "," __ column_list_item)* trailing:(__ "," __ empty)? {
...
}
column_list_item
= expr:star_or_qualified_star kw:(__ EXCEPT __) columns:paren$list$column {
...
}
/ expr:star_or_qualified_star kw:(__ REPLACE __) columns:paren$list$alias$expr {
...
}
/ star_or_qualified_star
/ expr:expr alias:(__ alias)? {
...
}
expr = mysql_assign_expr / or_expr
mysql_assign_expr = ... // ends up referencing same stuff that or_expr references
or_expr = ... It takes from
Essentially it comes down to this main |
Yes, it definitely can be optimized. I think that the problem is because it tried to check each rule independently and as result can traverse the same nodes many times. The best strategy would be to introduce new pass that would calculate some useful properties of each AST node, which includes:
Then to find left recursion we need to find rules where |
What we're trying to do is prove that there is NOT a loop, which means we only need to find the first loop of any size. In this case, speed matters more than completeness. I bet if we took one O(n) pass over the rules to generate an adjacency matrix indexed by rule number, then did a depth-first search on each non-seen rule, we'd be able to prove there were no loops more quickly. |
In the same pass, we could prove that certain rules were not reachable by any of the allowedStartRules, marking them for removal. |
I wonder if #467 will help with this? |
I don't think this will help at all. Though I might misunderstand - I didn't test. The main problem concerns the case where no errors are found. Which in my case is like 99.9999% of the time. Frankly, I wouldn't actually mind if it were slow when errors are present. I assume that moving this to |
Nod. There are a couple of speedups for big grammars that I can add after 4.0, but I expect most of those to have minimal impact to this. Someone will have to help with the approach Mingun suggested above, I'm unlikely to get to that myself. |
I've been thinking of tackling this, but each time I look at it, the problem sort of overwhelms me as I can't seem to even fully wrap my head around the current implementation. Mainly me not being familiar with the code base :( |
I'm working with a pretty large grammar (6000 LOC) and running Peggy has been getting prohibitively slow, with the compilation taking more than a minute.
So I looked into what might be causing such a drag and discovered that the main culprit is this reportInfiniteRecursion pass. I threw some logging code to the Peggy compiler logic that loops over all the compiler passes, which produced the following output:
As I understand, this pass goes through all rules and checks that they aren't left-recursive. But to do the latter, it recursively goes through all the rules referenced by that rule. Which I suspect makes the code run in O(n^2) time.
I'm sure there is a way to optimize this. I can try to figure this out, but perhaps somebody who's more intimately familiar with Peggy can easily figure out a solution. It's a bit of a shame that this non-essential compiler pass is taking over 90% of the compilation time.
The text was updated successfully, but these errors were encountered: