Pratt example by 39555 · Pull Request #622 · winnow-rs/winnow

39555 · 2024-11-17T20:57:26Z

An example of using the Pratt parser to parse a weird C expression (cexpr).

The result of parsing is a nicely formatted AST and the expression in prefix notation.

1 + 2 + 3:

ADD
  ADD
    VAL 1
    VAL 2
  VAL 3

(+ (+ 1 2) 3)
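
To make the relation between the two renderings concrete, here is a minimal standalone sketch. It is not the code from this PR: `Expr`, `tree`, and `sample` are hypothetical names, only `VAL` and `ADD` are modeled, and it assumes two spaces of indentation per level as in the output above.

```rust
use std::fmt;

// Hypothetical AST for illustration; the example in the PR has many more variants.
#[derive(Debug)]
enum Expr {
    Val(i64),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Renders the indented tree form, two spaces per nesting level.
    fn tree(&self) -> String {
        let mut out = String::new();
        self.write_tree(0, &mut out);
        out
    }

    fn write_tree(&self, depth: usize, out: &mut String) {
        let pad = "  ".repeat(depth);
        match self {
            Expr::Val(v) => out.push_str(&format!("{pad}VAL {v}\n")),
            Expr::Add(a, b) => {
                out.push_str(&format!("{pad}ADD\n"));
                a.write_tree(depth + 1, out);
                b.write_tree(depth + 1, out);
            }
        }
    }
}

// `Display` renders the prefix (s-expression) form.
impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Val(v) => write!(f, "{v}"),
            Expr::Add(a, b) => write!(f, "(+ {a} {b})"),
        }
    }
}

/// The tree a left-associative parse of `1 + 2 + 3` would produce.
fn sample() -> Expr {
    Expr::Add(
        Box::new(Expr::Add(Box::new(Expr::Val(1)), Box::new(Expr::Val(2)))),
        Box::new(Expr::Val(3)),
    )
}

fn main() {
    let e = sample();
    print!("{}", e.tree());
    println!("{e}");
}
```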

Parser Problems

Parsing any complex postfix operator (`a ? b : c`, `foo()`, `foo(1 + 2)`, `a[1 + 2]`) cannot be done without parsing it inside the operand parser, which means breaking the prefix precedence. Maybe the API should provide the input inside the operator closures. ~~But I'm having strange lifetime troubles with it for now~~

EDIT: I got this working by switching all callbacks to function pointers `fn() ->`. I had completely forgotten about it. Now the input is the first argument of the closures.
Maybe closures should return an error instead of a plain value to allow validating the input, e.g. rejecting the dereference of a literal (`1->foo`) or handling an unbalanced delimiter in complex postfix operators.

EDIT: all closures now return `PResult`. Looks quite ugly..
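
A rough standalone sketch of the idea, using only the standard library rather than winnow. Everything here is hypothetical and only meant to illustrate the shape: `PostfixAction`, `arrow`, `postfix_action`, and the `Expr` variants are made-up names, and `Result<Expr, String>` stands in for `PResult`. The action is a plain function pointer that receives the remaining input first and can reject `1->foo`.

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Name(String),
    /// `lhs->field`
    Arrow(Box<Expr>, String),
}

// Hypothetical action signature: remaining input first, then the operand.
// A plain `fn` pointer avoids the lifetime trouble of boxed or borrowed closures.
type PostfixAction = fn(&mut &str, Expr) -> Result<Expr, String>;

/// Action for a `->` postfix operator; the `->` itself is assumed to be consumed already.
fn arrow(input: &mut &str, lhs: Expr) -> Result<Expr, String> {
    // Validation that a plain-value callback could not express.
    if matches!(lhs, Expr::Num(_)) {
        return Err("cannot dereference a literal".to_string());
    }
    let s = *input; // copy the inner `&str` so `input` can be reassigned below
    let end = s
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(s.len());
    if end == 0 {
        return Err("expected a field name after `->`".to_string());
    }
    let (field, rest) = s.split_at(end);
    *input = rest; // advance the input past the field name
    Ok(Expr::Arrow(Box::new(lhs), field.to_string()))
}

/// Operator table: the cast turns the function item into a function pointer.
fn postfix_action(op: &str) -> Option<PostfixAction> {
    match op {
        "->" => Some(arrow as PostfixAction),
        _ => None,
    }
}

fn main() {
    let action = postfix_action("->").expect("`->` is in the table");
    let mut rest = "foo + 1";
    println!("{:?}", action(&mut rest, Expr::Name("p".to_string())));
    println!("remaining input: {rest:?}");
    let mut rest = "foo";
    println!("{:?}", action(&mut rest, Expr::Num(1)));
}
```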
The concept of operators that are neither left- nor right-associative. ~~I don't know yet how it works but the parser could potentially reject a == b == c somehow~~.

EDIT: added `Assoc::Neither` and tests. These should fail: `a == b == c`, `a < b < c`.
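
One possible way to get this behaviour, shown as a standalone sketch over whitespace-separated tokens. Only the `Assoc` variant names come from this PR; the mapping to binding powers, the operator table in `assoc_of`, and the rule "reject a second non-associative operator of the same precedence in one chain" are assumptions for illustration, not necessarily what the implementation here does.

```rust
#[derive(Debug, Clone, Copy)]
enum Assoc {
    Left(i64),
    Right(i64),
    Neither(i64),
}

impl Assoc {
    /// Returns (left binding power, right binding power).
    fn powers(self) -> (i64, i64) {
        match self {
            Assoc::Left(p) | Assoc::Neither(p) => (2 * p, 2 * p + 1),
            Assoc::Right(p) => (2 * p + 1, 2 * p),
        }
    }
}

// Hypothetical operator table.
fn assoc_of(op: &str) -> Option<Assoc> {
    match op {
        "=" => Some(Assoc::Right(1)),
        "==" => Some(Assoc::Neither(2)),
        "<" => Some(Assoc::Neither(3)),
        "+" => Some(Assoc::Left(4)),
        _ => None,
    }
}

fn expr(toks: &[&str], pos: &mut usize, min_bp: i64) -> Result<String, String> {
    let first = *toks.get(*pos).ok_or("expected an operand")?;
    if assoc_of(first).is_some() {
        return Err(format!("expected an operand, found `{first}`"));
    }
    *pos += 1;
    let mut lhs = first.to_string();
    // Precedence of the last non-associative operator applied at this level.
    let mut last_neither: Option<i64> = None;
    while let Some(&op) = toks.get(*pos) {
        let assoc = assoc_of(op).ok_or_else(|| format!("unknown operator `{op}`"))?;
        let (l_bp, r_bp) = assoc.powers();
        if l_bp < min_bp {
            break;
        }
        if let Assoc::Neither(p) = assoc {
            if last_neither == Some(p) {
                return Err(format!("operator `{op}` is non-associative"));
            }
        }
        *pos += 1;
        let rhs = expr(toks, pos, r_bp)?;
        lhs = format!("({op} {lhs} {rhs})");
        last_neither = match assoc {
            Assoc::Neither(p) => Some(p),
            _ => None,
        };
    }
    Ok(lhs)
}

fn parse(src: &str) -> Result<String, String> {
    let toks: Vec<&str> = src.split_whitespace().collect();
    let mut pos = 0;
    let out = expr(&toks, &mut pos, 0)?;
    if pos != toks.len() {
        return Err(format!("unexpected token `{}`", toks[pos]));
    }
    Ok(out)
}

fn main() {
    for src in ["1 + 2 + 3", "a = b = c", "a < b == c", "a == b == c", "a < b < c"] {
        println!("{src:<12} => {:?}", parse(src));
    }
}
```

In this sketch `Neither` binds like `Left` for its right operand, so the second `==` is seen by the outer loop, which is where it can be rejected instead of silently nesting.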

The API has a soundness problem: the `Parser` trait is implemented on `&Operator`, but inside `parse_next` a mutable ref and `RefCell::borrow_mut` are used, which can lead to problems. We can return to the API later. For now, let's keep only the essential algorithm and pass the affix parsers as 3 separate entities. Also add `left_binding_power` and `right_binding_power` to the operators, based on winnow-rs#614 (comment)

epage · 2024-11-18T17:59:29Z

examples/pratt/parser.rs

+                dispatch! {any;
+                    '!' => empty.value((20, (|_: &mut _, a| Ok(Expr::Fac(Box::new(a)))) as _)),
+                    '?' => empty.value((3, (|i: &mut &str, cond| {
+                        let (left, right) = preceded(multispace0, cut_err(separated_pair(pratt_parser, delimited(multispace0, ':', multispace0), pratt_parser))).parse_next(i)?;


Having to put `multispace0`s in here means that we have to successfully parse more of the input before we find out that it doesn't match, which hurts performance. I assume the way to handle this is to lex into tokens first and then run this parser on the tokens, so the whitespace is already taken care of.
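
A minimal sketch of that lexing step, written against the standard library only so it stands alone. `Token`, `lex`, and the punctuation set are hypothetical; a real version for this example would need multi-character operators and would likely be built from winnow parsers. The point is only that whitespace is skipped once, up front, so the operator parsers never have to deal with it.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Num(i64),
    Ident(String),
    Punct(char),
}

/// Splits the source into tokens, dropping all whitespace along the way.
/// Note: large number literals are not checked for overflow in this sketch.
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let mut out = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            let mut n = 0i64;
            while let Some(d) = chars.peek().and_then(|c| c.to_digit(10)) {
                n = n * 10 + i64::from(d);
                chars.next();
            }
            out.push(Token::Num(n));
        } else if c.is_ascii_alphabetic() || c == '_' {
            let mut s = String::new();
            while let Some(&c) = chars.peek() {
                if !(c.is_ascii_alphanumeric() || c == '_') {
                    break;
                }
                s.push(c);
                chars.next();
            }
            out.push(Token::Ident(s));
        } else if "+-*/!?:,()[]".contains(c) {
            chars.next();
            out.push(Token::Punct(c));
        } else {
            return Err(format!("unexpected character `{c}`"));
        }
    }
    Ok(out)
}

fn main() {
    println!("{:?}", lex("foo(1 +   2) ? a : b"));
}
```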

epage · 2024-11-18T18:01:06Z

examples/pratt/parser.rs

+                            "ge" => empty.value((12, 13, (|_: &mut _, a, b| Ok(Expr::GreaterEqual(Box::new(a), Box::new(b)))) as _)),
+                            "lt" => empty.value((12, 13, (|_: &mut _, a, b| Ok(Expr::Less(Box::new(a), Box::new(b)))) as _)),
+                            "le" => empty.value((12, 13, (|_: &mut _, a, b| Ok(Expr::LessEqual(Box::new(a), Box::new(b)))) as _)),
+                        _ => fail


Is this missing indentation?

Yes. Thanks. Fixed. It seems like rustfmt is having a hard time formatting macros.

epage · 2024-11-18T18:03:22Z

examples/pratt/parser.rs

+            // .parse("1 + 2 * *4^7! + 6")
+            .parse("foo(1 + 2 + 3) + bar() ? 1 : 2")
+            .unwrap();
+        println!("{r}");


With snapbox we can do snapshot testing of the .to_string()

I just pushed a lot of tests. There is another API complication that currently fails. When a recursive parser is invoked inside a ternary operator, it should know its starting precedence. Consider:
`a ? b : c, d`
It should parse as `(, (? a b c) d)`, but it currently parses as `(? a b (, c d))`: the part after `:` doesn't know that it belongs to the ternary operator. Two possible solutions:

- Allow users to provide a starting binding power, e.g. `precedence(0, ...)`. The user would call `precedence(ternary_precedence+1, ...)` inside the ternary operator.

- Require users to rebuild child parsers, excluding operators with precedence lower than the current one.

Another API option is for `fn precedence(...) -> Precedence` (instead of `impl Parser`) and have a `Precedence::initial_power` or whatever we want to call it. This would also be a violation of the API guidelines of trailing functions only affecting the return value.

I changed the parser to allow specifying the starting precedence. Now all the tests pass. I will add more error related tests.
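
To illustrate why the starting power matters, here is a standalone sketch over whitespace-separated tokens. It is not the API from this PR: `Parser`, `else_start`, and the concrete binding powers are hypothetical, and operands are not validated. The else-branch of `?:` is parsed with a configurable minimum power, so both parses of `a ? b : c , d` can be reproduced.

```rust
// Hypothetical binding powers: comma is the loosest operator, `?:` binds tighter.
const COMMA_BP: (i64, i64) = (2, 3); // (left, right)
const TERNARY_BP: i64 = 4;

struct Parser<'a> {
    toks: Vec<&'a str>,
    pos: usize,
    /// Minimum binding power used when parsing the branch after `:`.
    else_start: i64,
}

impl<'a> Parser<'a> {
    fn expr(&mut self, min_bp: i64) -> Result<String, String> {
        let tok = *self.toks.get(self.pos).ok_or("expected an operand")?;
        self.pos += 1;
        let mut lhs = tok.to_string();
        while let Some(&op) = self.toks.get(self.pos) {
            match op {
                "," if COMMA_BP.0 >= min_bp => {
                    self.pos += 1;
                    let rhs = self.expr(COMMA_BP.1)?;
                    lhs = format!("(, {lhs} {rhs})");
                }
                "?" if TERNARY_BP >= min_bp => {
                    self.pos += 1;
                    // Anything may appear between `?` and `:`.
                    let then = self.expr(0)?;
                    if self.toks.get(self.pos) != Some(&":") {
                        return Err("expected `:`".to_string());
                    }
                    self.pos += 1;
                    // The else-branch starts at the configured power.
                    let start = self.else_start;
                    let els = self.expr(start)?;
                    lhs = format!("(? {lhs} {then} {els})");
                }
                // Not an operator, or it binds too loosely for this level.
                _ => break,
            }
        }
        Ok(lhs)
    }
}

fn parse(src: &str, else_start: i64) -> Result<String, String> {
    let mut p = Parser { toks: src.split_whitespace().collect(), pos: 0, else_start };
    let out = p.expr(0)?;
    if p.pos != p.toks.len() {
        return Err(format!("unexpected token `{}`", p.toks[p.pos]));
    }
    Ok(out)
}

fn main() {
    // Starting at 0, the else-branch swallows the comma.
    println!("{:?}", parse("a ? b : c , d", 0));
    // Starting at the ternary's own power, the comma is left for the outer loop.
    println!("{:?}", parse("a ? b : c , d", TERNARY_BP));
}
```

Whether the right starting value is the ternary's power or that power plus one depends on how binding powers are compared; in this sketch, starting at `TERNARY_BP` keeps a nested `c ? d : e` in the else-branch.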

When all the missing parts work, we will see the full interface and can consider the most user-friendly design

39555 · 2024-11-19T09:25:38Z

Well, the example is complete and all the features are there! I assume with #618 it would be the same. I may check it later.

src/combinator/mod.rs

 mod parser;
 mod sequence;

+pub mod precedence;


src/combinator/precedence.rs

+}
+
+#[derive(Debug, Clone, Copy)]
+pub enum Assoc {
+    Left(i64),
+    Right(i64),
+    Neither(i64),


epage · 2024-11-19T19:59:50Z

@39555 I have to say, I am thoroughly impressed with the dedication you have put into this investigation, having

- Implemented the initial recursive version
- Implemented an iterative version out of my concern for stack overflows
- Implemented a full C expression parser with it to see what features are missing, adding features that I'm not seeing in other libraries that provide a generic "pratt" parser

This commit:

* Fixes errors due to winnow 0.7 migration
* Adapts winnow-rs/winnow#620 to work outside winnow
* Uncomments test cases

For the migration steps, I've attached CHANGELOG.md line numbers for context. Use winnow-rs/winnow@73c6e05 for the version of CHANGELOG.md in winnow-rs/winnow, as the line numbers will inevitably change with time. The main migration steps are:

* `PResult` replaced with `winnow::Result` (L153, L92-L98)
* Use `winnow::Result` over `ModalResult` when `cut_err` isn't used
* Swap `ErrMode::from_error_kind` to `ParserError::from_input` (L89)

References: winnow-rs/winnow#618
References: winnow-rs/winnow#620
References: winnow-rs/winnow#622

A new feature 'unstable' publishes module internals for tests and examples only. This isn't a general-purpose winnow crate, so keep it private by default. Refs: winnow-rs/winnow#622

ssmendon · 2025-05-31T23:21:42Z

I don't know what the status is on this PR, but I was working on fixing this up for v0.7 for my own crate.

You can check it out at https://github.com/ssmendon/rollers/tree/main/crates/pratt if it's helpful at all.

@39555

This commit adds a new example `c_expression` for the Pratt parser added to the `expression` module. It also adds this example to the language special topic. This example was primarily authored by @39555 in PR winnow-rs#622. @ssmendon rebased it to the newest release of winnow and added extra documentation. Co-authored-by: Sohum Mendon <smendon@proton.me>

39555 and others added 14 commits November 12, 2024 02:37

feat: implement Pratt parser

fed8c90

commit suggestion

ee4459d

Co-authored-by: Ed Page <eopage@gmail.com>

remove spaces from #[doc(alias = "...")]

4b1499d

remove UnaryOp and BinaryOp in favor of Fn

acf4577

This feature was an overengineering based on suggestion "Why make our own trait" in winnow-rs#614 (comment)

remove redundant trait impl

a816a1c

works without it

remove allow_unused, move allow(non_snake_case) to where it shoul…

2a80e65

…d be - based on review "Why allow non_snake_case?" in winnow-rs#614 (comment) - remove `allow_unused` based on "Whats getting unused?" winnow-rs#614 (comment)

stop dumping pratt into combinator namespace

29fe18d

until we find a satisfactory api based on winnow-rs#614 (comment) > "We are dumping a lot of stray types into combinator. The single-line summaries should make it very easy to tell they are related to precedence"

move important things to go first

5a4f4b4

based on "Organizationally, I prefer the "top level" thing going first and then branching out from there. In this case, precedence is core." winnow-rs#614 (comment)

remove wrong and long doc for now

0273a29

I will write the documentation later

fix: precedence for associativity, remove trace()

f218911

- require explicit `trace` for operators - fix associativity handling for infix operators: `1 + 2 + 3` should be `(1 + 2) + 3` and not `1 + (2 + 3)`

switch from &dyn Fn(O) -> O to fn(O) -> O

3d7ef41

feat: pass Input into operator closures

a6cbc1a

add trace for tests parser

29b64fa

39555 force-pushed the pratt-example branch from b9e799d to c3e18a8 Compare November 17, 2024 22:05

feat: operator closures must return PResult

b31a3a3

39555 force-pushed the pratt-example branch from c3e18a8 to 4d9f2dc Compare November 18, 2024 16:40

epage mentioned this pull request Nov 18, 2024

Pratt parsing support #131

Closed

2 tasks

epage reviewed Nov 18, 2024

feat: allow the user to specify starting power

33c82f3

39555 force-pushed the pratt-example branch from d8c74b1 to 211f9de Compare November 18, 2024 20:37

39555 added a commit to 39555/winnow that referenced this pull request Nov 18, 2024

style: fix indentation

ca603d1

winnow-rs#622 (comment)

39555 added a commit to 39555/winnow that referenced this pull request Nov 18, 2024

refactor: remove unnecessarily multispace0

ee6c3b7

winnow-rs#622 (comment)

feat: enum Assoc for infix operators. Add Neither associativity

040dd85

39555 added a commit to 39555/winnow that referenced this pull request Nov 19, 2024

style: fix indentation

7868cda

winnow-rs#622 (comment)

39555 added a commit to 39555/winnow that referenced this pull request Nov 19, 2024

refactor: remove unnecessarily multispace0

70f5c6d

winnow-rs#622 (comment)

39555 force-pushed the pratt-example branch from 45317f9 to 5177b7d Compare November 19, 2024 08:37

39555 added a commit to 39555/winnow that referenced this pull request Nov 19, 2024

style: fix indentation

7672302

winnow-rs#622 (comment)

39555 added 10 commits November 19, 2024 12:50

example: pratt expression parser

8f18fc2

feat: complex postfix operators

a4ad844

- ternary operator - function call - index

pratt_example: operator closures return PResult

54cb315

test: add tests

d6da343

specify the parser start precedence

c1a8535

- fix failing tests related to the ternary operator and commas

style: fix indentation

a85291b

winnow-rs#622 (comment)

refactor: remove unnecessarily multispace0

39cc484

winnow-rs#622 (comment)

fix: failed tests

c52c10d

use Assoc enum. tests for associativity Neither

d3c3d0a

fix: switch to i64

b7b0629

39555 force-pushed the pratt-example branch from 425cd2d to b7b0629 Compare November 19, 2024 08:50

tests ill-formed expressions

5e7fb65

39555 force-pushed the pratt-example branch from dd7e44c to 5e7fb65 Compare November 19, 2024 11:19

github-advanced-security bot found potential problems Nov 19, 2024

ssmendon added a commit to ssmendon/rollers that referenced this pull request May 31, 2025

bench(pratt): port bench from winnow-rs/winnow#622

d7ba7c3

epage mentioned this pull request Jul 7, 2025

Pratt parser #798

Closed

ssmendon added a commit to ssmendon/winnow that referenced this pull request Jul 26, 2025

chore: merging of winnow-rs#622

567830e

ssmendon mentioned this pull request Jul 27, 2025

feat: implement Pratt parsing #804

Merged

epage closed this Nov 27, 2025
