
perf: bench for pratt and shunting yard parsers #620

Closed
39555 wants to merge 47 commits into winnow-rs:main from 39555:pratt-perf

Conversation

@39555
Contributor

39555 commented Nov 14, 2024

The results are quite interesting. When parsing without recursive brackets, the two parsers perform on par:

pratt/pratt             time:   [44.447 µs 44.870 µs 45.396 µs]
                        change: [-2.2112% -0.9080% +0.4095%] (p = 0.17 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe


pratt/shunting yard     time:   [45.557 µs 46.063 µs 46.590 µs]
                        change: [-0.0566% +1.2500% +2.5518%] (p = 0.05 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

But when parsing with recursive bracketed subexpressions, the performance of the shunting yard parser declines dramatically:

pratt/pratt             time:   [52.585 µs 53.006 µs 53.450 µs]
                        change: [+16.321% +17.706% +19.164%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild


pratt/shunting yard     time:   [72.964 µs 73.782 µs 74.596 µs]
                        change: [+58.952% +61.360% +63.907%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

Linux x86_64, Intel(R) Core(TM) i5-10400F CPU @ 2.90GHz.

The same holds on macOS (M2).

Here are the corresponding flamegraphs (the parsing includes bracketed expressions):

I think it is related to frequent raw_vec::allocate calls followed by drop -> deallocate when parsing recursive bracketed subexpressions such as (1).

@epage
Collaborator

epage commented Nov 14, 2024

Oh, interesting.

If only there was a way to share an "allocator" between each recursive instance of the algorithm. If we provided a custom allocator, users could at least seed it with an expected capacity. Granted, in that case, the user could specify smallvec and put things back on the stack, if the typical stack size is relatively small. The problem is implementing whatever trait we make for smallvec: the orphan rules prevent users from doing it themselves, and I'd prefer to keep external types out of our API.
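
To make that trade-off concrete, the kind of abstraction being discussed might look like the following. This is purely a hypothetical sketch, not a proposed winnow API; the trait name and methods are made up.

trait PrecedenceStack<T>: Default {
    fn with_capacity(capacity: usize) -> Self;
    fn push(&mut self, value: T);
    fn pop(&mut self) -> Option<T>;
}

// winnow could ship the obvious impl for Vec...
impl<T> PrecedenceStack<T> for Vec<T> {
    fn with_capacity(capacity: usize) -> Self {
        Vec::with_capacity(capacity)
    }
    fn push(&mut self, value: T) {
        Vec::push(self, value);
    }
    fn pop(&mut self) -> Option<T> {
        Vec::pop(self)
    }
}

// ...but a user crate could not write `impl PrecedenceStack<T> for SmallVec<[T; 4]>`
// itself: both the trait and the type would be foreign to it, so the orphan rules
// reject the impl, which is the problem described above.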

@epage
Collaborator

epage commented Nov 14, 2024

I did try

#[derive(Default)]
enum Stack<T> {
    #[default]
    Empty,
    One(T),
    Many(Vec<T>),
}

and that saw no performance change, despite my quick and dirty hacks showing the stack sizes were usually 0-2 items, up to 4 items for the benchmark.
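
For reference, pushing onto such a stack promotes the inline variants into a Vec, roughly like this (a sketch of the idea, not necessarily the exact code that was benchmarked):

impl<T> Stack<T> {
    fn push(&mut self, value: T) {
        // Temporarily swap in `Empty` so the current contents can be moved out.
        match std::mem::replace(self, Stack::Empty) {
            Stack::Empty => *self = Stack::One(value),
            Stack::One(first) => *self = Stack::Many(vec![first, value]),
            Stack::Many(mut items) => {
                items.push(value);
                *self = Stack::Many(items);
            }
        }
    }
}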

@epage
Collaborator

epage commented Nov 14, 2024

A capacity of 4 didn't make a difference either.

@epage
Collaborator

epage commented Nov 14, 2024

If we had built-in parenthesis handling, then this would likely not be an issue.

@epage
Collaborator

epage commented Nov 14, 2024

I'm leaning towards going the safety route. The level of nesting used in the benchmark is an extreme case.

@39555
Contributor Author

39555 commented Nov 15, 2024

SmallVec::<[_; 4]> did the trick and made the shunting yard parser slightly faster than Pratt when running without brackets, with comparable timings when brackets are used.
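
For context, the change amounts to swapping the backing storage of the evaluation stacks, roughly as below. This is a sketch with made-up names, not the exact diff in this branch; the first 4 entries stay on the stack and only deeper nesting spills to the heap.

use smallvec::SmallVec;

// Hypothetical shunting-yard state with inline capacity for 4 entries.
struct EvalState<Operand, Operator> {
    operands: SmallVec<[Operand; 4]>,
    operators: SmallVec<[Operator; 4]>,
}

impl<Operand, Operator> EvalState<Operand, Operator> {
    fn new() -> Self {
        Self {
            operands: SmallVec::new(),
            operators: SmallVec::new(),
        }
    }
}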

Report (numbers from M2)

pratt/pratt             time:   [26.492 µs 26.529 µs 26.582 µs]
                        change: [-4.4518% -4.1741% -3.8992%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe

pratt/shunting yard     time:   [23.692 µs 23.732 µs 23.776 µs]
                        change: [+0.6546% +0.9260% +1.1760%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Small expressions

Unfortunately, this is not the case for small expressions such as 1+(1+1*3)*1+(1+(1*4))*1*5, where the shunting yard parser is consistently slower.

Report (numbers from M2)


pratt/pratt             time:   [136.57 ns 136.96 ns 137.51 ns]
                        change: [-49.206% -49.000% -48.769%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe

pratt/shunting yard     time:   [196.34 ns 196.97 ns 197.86 ns]
                        change: [-38.493% -38.304% -38.096%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Without SmallVec, the performance degrades significantly:

Report (numbers from M2)

pratt/pratt             time:   [136.90 ns 136.99 ns 137.11 ns]
                        change: [-0.8013% -0.3646% +0.0220%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

pratt/shunting yard     time:   [280.28 ns 280.48 ns 280.70 ns]
                        change: [+42.683% +43.255% +43.776%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 20 outliers among 100 measurements (20.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  13 (13.00%) high severe

And the performance is not much better without brackets:

Report (numbers from M2)

pratt/pratt             time:   [116.45 ns 116.54 ns 116.63 ns]
                        change: [-3.5605% -3.3230% -3.1042%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

pratt/shunting yard     time:   [164.99 ns 165.11 ns 165.23 ns]
                        change: [+2.2790% +2.6136% +2.9324%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

@epage
Collaborator

epage commented Nov 15, 2024

I worry about pulling in a dependency just for one algorithm that most won't use, or about coming up with an appropriate abstraction for the feature. That mostly leaves smallvec support down to how well we can make the stack customizable for people to use it.

Btw, something I forgot to call out in the discussion of trade-offs in this thread: recursion is a more trivial solution for no_std than using the heap.

Also, API-wise, a pratt parser could return a "Pratt" struct with a method for controlling recursion depth, so we could do it on behalf of users.
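
Something along these lines, purely as an illustration; none of these names exist in winnow:

// Hypothetical builder-style return value: the combinator hands back a struct
// whose methods tune the algorithm before it is used as a parser.
pub struct Pratt<Operand, Operator> {
    operand: Operand,
    operator: Operator,
    max_depth: usize,
}

impl<Operand, Operator> Pratt<Operand, Operator> {
    /// Cap how deep prefix chains and right-associative operators may recurse;
    /// past the cap the parser would return an error instead of overflowing.
    pub fn max_recursion_depth(mut self, depth: usize) -> Self {
        self.max_depth = depth;
        self
    }
}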

Over at #618 (comment) you called out the Double E Method.

What's your opinion on:

  • Pratt vs. shunting yard
  • Whether it's worth experimenting with the Double E Method to reduce the pains associated with shunting yard

@39555
Contributor Author

39555 commented Nov 15, 2024

Whether it's worth experimenting with the Double E Method to reduce the pains associated with shunting yard

That link, as I understand it, describes the same approach that I've implemented here and called shunting yard. It's a modification of the original shunting yard algorithm, but it can handle unary operators and can reject ill-formed expressions, similar to the Pratt parser.
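
For readers unfamiliar with the distinction: the key addition over textbook shunting yard is tracking whether an operand or an operator is expected next, which is what makes prefix operators and rejection of ill-formed input possible. A minimal sketch of just that state machine (illustrative only, with single-character tokens; parentheses and the precedence stacks are omitted):

#[derive(Clone, Copy, PartialEq)]
enum Expect {
    Operand,
    Operator,
}

fn validate(tokens: &[char]) -> Result<(), String> {
    let mut expect = Expect::Operand;
    for &t in tokens {
        match (expect, t) {
            // In operand position, '-' is a prefix operator: keep expecting an operand.
            (Expect::Operand, '-') => {}
            (Expect::Operand, d) if d.is_ascii_digit() => expect = Expect::Operator,
            // In operator position, a binary operator sends us back to operand position.
            (Expect::Operator, '+' | '-' | '*' | '/') => expect = Expect::Operand,
            _ => return Err(format!("unexpected token {t:?}")),
        }
    }
    if expect == Expect::Operator {
        Ok(())
    } else {
        Err("expression ended after an operator".into())
    }
}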

We already have two algorithms with similar interfaces. I'm planning to create a comprehensive example, similar to the JSON one, which will be useful to have anyway. Something like parsing function calls or ternary operators might show the advantages and disadvantages of each approach.

39555 and others added 14 commits November 16, 2024 12:34
Co-authored-by: Ed Page <eopage@gmail.com>
This feature was over-engineering

based on suggestion "Why make our own trait" in
winnow-rs#614 (comment)
…d be

- based on review "Why allow non_snake_case?"

in winnow-rs#614 (comment)

- remove `allow_unused` based on "Whats getting unused?"
winnow-rs#614 (comment)
until we find a satisfactory api

based on
winnow-rs#614 (comment)

> "We are dumping a lot of stray types into combinator. The single-line
summaries should make it very easy to tell they are related to
precedence"
based on "Organizationally, O prefer the "top level" thing going first
and then branching out from there. In this case, precedence is core."

winnow-rs#614 (comment)
The API has a soundness problem: the `Parser` trait is implemented on
`&Operator`, but inside `parse_next` a mutable ref and
`RefCell::borrow_mut` are used, which can lead to problems.

We can return to the API later, but for now let's keep only the essential
algorithm and pass the affix parsers as 3 separate entities.

Also add left_binding_power and right_binding_power to the operators
based on
winnow-rs#614 (comment)
I will write the documentation later
# Conflicts:
#	src/combinator/shunting_yard.rs
@39555
Contributor Author

39555 commented Nov 20, 2024

Update with feature-complete parsers

I pushed an updated benchmark inside the example using the bumpalo allocator.
Input: a = 2*-2 * (a ? 1 + 2 * 4 - --a.bar + 2 : 2) / ( &**foo.a->p! -+1) + 3^1 / 4 == 1 * (2 - 7 + 567 *12 /2) + 3*(1+2*( 45 /2))
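
For context, the two arena strategies benchmarked below differ roughly as follows. The Expr type and parse_into helper are stand-ins, not the code in the example; only the bumpalo calls are real API.

use bumpalo::Bump;

#[allow(dead_code)]
enum Expr<'arena> {
    Num(i64),
    Binary(char, &'arena Expr<'arena>, &'arena Expr<'arena>),
}

// Stand-in for running one of the parsers over `input`, allocating every AST
// node with `arena.alloc(...)` inside the fold callbacks.
fn parse_into<'arena>(arena: &'arena Bump, input: &str) -> &'arena Expr<'arena> {
    let _ = input;
    arena.alloc(Expr::Num(0))
}

fn main() {
    // "Reusing the bump arena": allocate once, reset() between parses,
    // so most iterations never touch the global allocator.
    let mut arena = Bump::new();
    for _ in 0..3 {
        parse_into(&arena, "1 + 2 * 3");
        arena.reset();
    }

    // "A new bump arena each time": pay for arena creation and teardown per parse.
    for _ in 0..3 {
        let arena = Bump::new();
        parse_into(&arena, "1 + 2 * 3");
    }
}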

Without bumpalo

pratt/pratt        
                        time:   [2.4638 µs 2.4669 µs 2.4703 µs]
                        change: [+0.4618% +0.6850% +0.9004%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

pratt/shunting_yard     
                        time:   [2.5804 µs 2.5814 µs 2.5829 µs]
                        change: [-0.6435% -0.3945% -0.1386%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  8 (8.00%) high severe

Reusing the bump arena

pratt/pratt             
                        time:   [1.6877 µs 1.6896 µs 1.6915 µs]
                        change: [-1.6832% -1.4469% -1.2231%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

pratt/shunting_yard     
                        time:   [1.9346 µs 1.9364 µs 1.9385 µs]
                        change: [-1.3229% -1.0127% -0.7373%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

A new bump arena each time

pratt/pratt_with_new_bump_each_time
                        time:   [1.7935 µs 1.7950 µs 1.7973 µs]
                        change: [-0.8932% -0.7001% -0.5126%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

spratt/shunting_yard_with_new_bump_each_time
                        time:   [2.0403 µs 2.0420 µs 2.0450 µs]
                        change: [-1.7721% -1.5693% -1.3480%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe

@39555
Contributor Author

39555 commented Nov 20, 2024

It is hard to beat the application stack in terms of flexibility, platform support, and speed. The recursion grows only with prefix chains like -+--++-++--+1 and with right-associative infix operators such as pow: 1 ** 2 ** 3 ** 4 ** 5 ** 7 ** 2 ** 1. I assume the user can track the recursion inside the parser and the fold function using the Stateful<> input, as the fold occurs only after all the operands have been parsed.

|i: &mut Stateful<...>| {
    if i.state.recursion > 99 {
        return Err(...);
    }
    i.state.recursion += 1;
    ...
}.value((Right(20), |i: _, a: _, b: _| {
    i.state.recursion -= 1;
    a + b
}))
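
A slightly more concrete version of that sketch, written against the winnow 0.6 API this PR targets; the State type, the depth threshold, and the helper names are made up for illustration:

use winnow::combinator::fail;
use winnow::error::PResult;
use winnow::stream::Stateful;

#[derive(Debug, Clone, Copy, Default)]
struct State {
    recursion: usize,
}

type Input<'i> = Stateful<&'i str, State>;

// Called by a prefix-operator (or right-associative infix) branch before it
// descends: turns runaway nesting into a parse error instead of a stack overflow.
fn enter_recursion(i: &mut Input<'_>) -> PResult<()> {
    if i.state.recursion > 99 {
        return fail(i);
    }
    i.state.recursion += 1;
    Ok(())
}

// Called from the fold callback once the nested operand has been folded.
fn leave_recursion(i: &mut Input<'_>) {
    i.state.recursion = i.state.recursion.saturating_sub(1);
}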

ssmendon added a commit to ssmendon/rollers that referenced this pull request May 31, 2025
This commit:
* Fixes errors due to winnow 0.7 migration
* Adapts winnow-rs/winnow#620 to work outside winnow
* Uncomments test cases

For the migration steps, I've attached CHANGELOG.md line numbers for
context.

Use winnow-rs/winnow@73c6e05 for the
version of CHANGELOG.md in winnow-rs/winnow, as the line numbers will
inevitably change with time.

The main migration steps are:

  * `PResult` replaced with `winnow::Result` (L153, L92-L98)
  * Use `winnow::Result` over `ModalResult` when `cut_err` isn't used
  * Swap `ErrMode::from_error_kind` to `ParserError::from_input` (L89)

References: winnow-rs/winnow#618
References: winnow-rs/winnow#620
References: winnow-rs/winnow#622
ssmendon added a commit to ssmendon/rollers that referenced this pull request Jun 1, 2025
I also edited some benchmark IDs to be consistent. For now, I'm fine
having criterion write to the same ID. It's convenient for my CLI
testing.
@lu-zero lu-zero mentioned this pull request Jul 6, 2025
@epage epage closed this Nov 27, 2025