
Pratt example #622

Closed
39555 wants to merge 29 commits into winnow-rs:main from 39555:pratt-example

Conversation

@39555
Contributor

@39555 39555 commented Nov 17, 2024

An example of using the Pratt parser to parse a weird cexpr (C expression).

The result of parsing is a nicely formatted AST and the expression in prefix notation.

1 + 2 + 3:

ADD
  ADD
    VAL 1
    VAL 2
  VAL 3

(+ (+ 1 2) 3)
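
For readers who want to see how the two output forms relate, here is a minimal sketch. The `Expr` type, its two variants, and the `tree` and `sample` helpers are hypothetical and only cover this one example; the `Expr` in the PR has many more variants and may format itself differently.

```rust
use std::fmt;

/// Hypothetical two-variant AST, just enough for `1 + 2 + 3`.
#[derive(Debug)]
enum Expr {
    Val(i64),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Appends the indented tree form to `out`, two spaces per nesting level.
    fn tree(&self, depth: usize, out: &mut String) {
        let pad = "  ".repeat(depth);
        match self {
            Expr::Val(v) => out.push_str(&format!("{pad}VAL {v}\n")),
            Expr::Add(a, b) => {
                out.push_str(&format!("{pad}ADD\n"));
                a.tree(depth + 1, out);
                b.tree(depth + 1, out);
            }
        }
    }
}

/// `Display` renders the prefix (s-expression) form.
impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Val(v) => write!(f, "{v}"),
            Expr::Add(a, b) => write!(f, "(+ {a} {b})"),
        }
    }
}

/// Builds the left-associative AST for `1 + 2 + 3` by hand (no parsing here).
fn sample() -> Expr {
    Expr::Add(
        Box::new(Expr::Add(Box::new(Expr::Val(1)), Box::new(Expr::Val(2)))),
        Box::new(Expr::Val(3)),
    )
}
```

Calling `sample().to_string()` gives the prefix form, and `sample().tree(0, &mut out)` fills `out` with the indented form shown above.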

Parser Problems

  • Parsing any complex postfix operator (a ? b : c, foo(), foo(1 + 2), a[1 + 2]) cannot be done without parsing it inside the operand parser, which breaks prefix precedence. Maybe the API should pass the input to the operator closures, but I'm having strange lifetime troubles with that for now.

    EDIT: I made it work by switching all callbacks to function pointers (fn() ->). I had completely forgotten about that option. Now the input is the first argument of the closures.

  • Maybe closures should return a result instead of a plain value so they can validate the input, e.g. reject dereferencing a literal (1->foo) or report an unbalanced delimiter in complex postfixes.

    EDIT: all closures now return PResult. Looks quite ugly..

  • The concept of "neither" associativity. I don't know yet how it works, but the parser could potentially reject a == b == c.

    EDIT: added Assoc::Neither and tests. This should fail: a == b == c, a < b < c.
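
One possible way to get this behaviour, as a standalone sketch rather than the code in this PR: the `Assoc` enum below mirrors the shape used here, while the operator table, the binding powers, the whitespace-separated token format, and the `infix`, `parse` and `parse_expr` helpers are all assumptions made for illustration.

```rust
/// Same shape as the `Assoc` enum in this PR; the payload is the binding power.
#[derive(Debug, Clone, Copy)]
enum Assoc {
    Left(i64),
    Right(i64),
    Neither(i64),
}

/// Hypothetical operator table; the powers are arbitrary.
fn infix(op: &str) -> Option<Assoc> {
    match op {
        "=" => Some(Assoc::Right(5)),
        "==" | "<" => Some(Assoc::Neither(10)),
        "+" => Some(Assoc::Left(20)),
        _ => None,
    }
}

/// Pratt loop over pre-split tokens; returns the expression in prefix notation.
fn parse(tokens: &[&str], pos: &mut usize, min_power: i64) -> Result<String, String> {
    let tok = tokens.get(*pos).ok_or("expected operand")?;
    if infix(tok).is_some() {
        return Err(format!("expected operand, found `{tok}`"));
    }
    let mut lhs = tok.to_string();
    *pos += 1;
    // Power of the non-associative operator folded most recently at this level.
    let mut last_neither: Option<i64> = None;
    while let Some(op) = tokens.get(*pos) {
        let assoc = infix(op).ok_or_else(|| format!("unexpected token `{op}`"))?;
        // Left and Neither parse the right side one level tighter; Right reuses its own power.
        let (power, next_min) = match assoc {
            Assoc::Left(p) | Assoc::Neither(p) => (p, p + 1),
            Assoc::Right(p) => (p, p),
        };
        if power < min_power {
            break;
        }
        // Chaining two non-associative operators of the same power is an error.
        if matches!(assoc, Assoc::Neither(_)) && last_neither == Some(power) {
            return Err(format!("`{op}` is non-associative and cannot be chained"));
        }
        *pos += 1;
        let rhs = parse(tokens, pos, next_min)?;
        lhs = format!("({op} {lhs} {rhs})");
        last_neither = match assoc {
            Assoc::Neither(p) => Some(p),
            _ => None,
        };
    }
    Ok(lhs)
}

/// Convenience wrapper: split on whitespace and require all input to be consumed.
fn parse_expr(src: &str) -> Result<String, String> {
    let tokens: Vec<&str> = src.split_whitespace().collect();
    let mut pos = 0;
    let out = parse(&tokens, &mut pos, 0)?;
    if pos == tokens.len() {
        Ok(out)
    } else {
        Err("trailing input".to_string())
    }
}
```

With this table, `parse_expr("a == b == c")` and `parse_expr("a < b < c")` return an error, while `a + b == c + d` still parses because the `+` operands sit between the two sides of a single `==`.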

39555 and others added 14 commits November 12, 2024 02:37
Co-authored-by: Ed Page <eopage@gmail.com>
This feature was an overengineering

based on suggestion "Why make our own trait" in
winnow-rs#614 (comment)
…d be

- based on review "Why allow non_snake_case?"

in winnow-rs#614 (comment)

- remove `allow_unused` based on "Whats getting unused?"
winnow-rs#614 (comment)
until we find a satisfactory api

based on
winnow-rs#614 (comment)

> "We are dumping a lot of stray types into combinator. The single-line
summaries should make it very easy to tell they are related to
precedence"
based on "Organizationally, I prefer the "top level" thing going first
and then branching out from there. In this case, precedence is core."

winnow-rs#614 (comment)
The API has a soundness problem. The `Parser` trait is implemented on
`&Operator`, but inside `parse_next` a mutable ref and
`RefCell::borrow_mut` are used, which can lead to potential problems.

We can return to the API later. But for now let's keep only the essential
algorithm and pass affix parsers as 3 separate entities

Also add left_binding_power and right_binding_power to the operators
based on
winnow-rs#614 (comment)
I will write the documentation later
- require explicit `trace` for operators
- fix associativity handling for infix operators:
`1 + 2 + 3` should be `(1 + 2) + 3` and not `1 + (2 + 3)`
@epage epage mentioned this pull request Nov 18, 2024
2 tasks
dispatch! {any;
'!' => empty.value((20, (|_: &mut _, a| Ok(Expr::Fac(Box::new(a)))) as _)),
'?' => empty.value((3, (|i: &mut &str, cond| {
let (left, right) = preceded(multispace0, cut_err(separated_pair(pratt_parser, delimited(multispace0, ':', multispace0), pratt_parser))).parse_next(i)?;
Collaborator

Having to put multispace0s in here means that we have to successfully parse more of the input before we find that it doesn't match, hurting performance. I assume the way to handle this is to lex into tokens and then run this parser on tokens which will have the multi-space taken care of.
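
A rough sketch of that lexing step, using only the standard library. The `Token` type, the `lex` function, and the set of accepted punctuation are assumptions for illustration; none of them come from this PR or from winnow.

```rust
/// Hypothetical token type; whitespace never reaches the expression parser.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Num(i64),
    Ident(String),
    Punct(char),
}

/// Splits `src` into tokens, dropping whitespace once, up front.
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            // Collect the digit run, then let `parse` handle overflow.
            let mut digits = String::new();
            while let Some(&d) = chars.peek() {
                if !d.is_ascii_digit() {
                    break;
                }
                digits.push(d);
                chars.next();
            }
            let n = digits
                .parse()
                .map_err(|e| format!("bad number `{digits}`: {e}"))?;
            tokens.push(Token::Num(n));
        } else if c.is_ascii_alphabetic() || c == '_' {
            let mut name = String::new();
            while let Some(&d) = chars.peek() {
                if !(d.is_ascii_alphanumeric() || d == '_') {
                    break;
                }
                name.push(d);
                chars.next();
            }
            tokens.push(Token::Ident(name));
        } else if "+-*/?:(),".contains(c) {
            // Single-character punctuation only, to keep the sketch short.
            tokens.push(Token::Punct(c));
            chars.next();
        } else {
            return Err(format!("unexpected character {c:?}"));
        }
    }
    Ok(tokens)
}
```

An operator parser running over `&[Token]` could then match `Token::Punct('?')` or `Token::Punct(':')` directly, with no whitespace handling inside the operator closures.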

"ge" => empty.value((12, 13, (|_: &mut _, a, b| Ok(Expr::GreaterEqual(Box::new(a), Box::new(b)))) as _)),
"lt" => empty.value((12, 13, (|_: &mut _, a, b| Ok(Expr::Less(Box::new(a), Box::new(b)))) as _)),
"le" => empty.value((12, 13, (|_: &mut _, a, b| Ok(Expr::LessEqual(Box::new(a), Box::new(b)))) as _)),
_ => fail
Collaborator

Is this missing indentation?

Contributor Author

Yes. Thanks. Fixed. It seems like rustfmt is having a hard time formatting macros.

// .parse("1 + 2 * *4^7! + 6")
.parse("foo(1 + 2 + 3) + bar() ? 1 : 2")
.unwrap();
println!("{r}");
Collaborator

With snapbox we can do snapshot testing of the .to_string()

Contributor Author

@39555 39555 Nov 18, 2024

I just pushed a lot of tests. There is another API complication that currently fails. When invoking a recursive parser in a ternary operator, it should know its starting precedence. Consider:
a ? b : c, d
It should parse as (, (? a b c) d), but currently it parses as (? a b (, c d)). The part after : doesn't know it belongs to the ternary operator. Two possible solutions:

  • Allow users to provide a starting binding power, e.g. precedence(0, ...). The user would call precedence(ternary_precedence+1, ...) inside the ternary operator.
  • Require users to rebuild child parsers, excluding operators with precedence lower than the current one.
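
The first option can be sketched with a small hand-written Pratt loop. This is standard-library-only code, not the winnow API from this PR: the `COMMA` and `TERNARY` powers, the whitespace-separated token format, and the `expr` and `parse_ternary` helpers are assumptions made for illustration.

```rust
/// Binding powers are assumptions for this sketch; `,` binds looser than `?:`.
const COMMA: i64 = 1;
const TERNARY: i64 = 3;

/// Parses from `pos`, folding only operators whose power is at least `min_power`.
/// Returns the expression in prefix notation.
fn expr(tokens: &[&str], pos: &mut usize, min_power: i64) -> Result<String, String> {
    let mut lhs = tokens.get(*pos).ok_or("expected operand")?.to_string();
    *pos += 1;
    while let Some(&op) = tokens.get(*pos) {
        match op {
            "," if COMMA >= min_power => {
                *pos += 1;
                // Left-associative: the right side must bind tighter than `,`.
                let rhs = expr(tokens, pos, COMMA + 1)?;
                lhs = format!("(, {lhs} {rhs})");
            }
            "?" if TERNARY >= min_power => {
                *pos += 1;
                // The middle branch is delimited by `?` and `:`, so it restarts at 0.
                let then = expr(tokens, pos, 0)?;
                if tokens.get(*pos) != Some(&":") {
                    return Err("expected `:`".to_string());
                }
                *pos += 1;
                // The else branch starts at the ternary's own power, so a following
                // `,` is left for the caller instead of being swallowed here.
                let otherwise = expr(tokens, pos, TERNARY)?;
                lhs = format!("(? {lhs} {then} {otherwise})");
            }
            // Unknown token or operator too weak for this level: stop.
            _ => break,
        }
    }
    Ok(lhs)
}

/// Convenience wrapper: split on whitespace and require all input to be consumed.
fn parse_ternary(src: &str) -> Result<String, String> {
    let tokens: Vec<&str> = src.split_whitespace().collect();
    let mut pos = 0;
    let out = expr(&tokens, &mut pos, 0)?;
    if pos == tokens.len() {
        Ok(out)
    } else {
        Err("trailing input".to_string())
    }
}
```

The sketch passes `TERNARY` rather than `TERNARY + 1` to the else branch so that chained ternaries nest to the right, as in C. Either value keeps `,` out of the else branch, which is the point of the starting binding power.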

Collaborator

Another API option is for fn precedence(...) -> Precedence (instead of impl Parser) and have a Precedence::initial_power or whatever we want to call it. This would also be a violation of the API guidelines of trailing functions only affecting the return value.

Contributor Author

I changed the parser to allow specifying the starting precedence. Now all the tests pass. I will add more error related tests.

Contributor Author

When all the missing parts work, we will see the full interface and can consider the most user-friendly design.

39555 added a commit to 39555/winnow that referenced this pull request Nov 18, 2024
39555 added a commit to 39555/winnow that referenced this pull request Nov 18, 2024
39555 added a commit to 39555/winnow that referenced this pull request Nov 19, 2024
39555 added a commit to 39555/winnow that referenced this pull request Nov 19, 2024
39555 added a commit to 39555/winnow that referenced this pull request Nov 19, 2024
@39555
Contributor Author

39555 commented Nov 19, 2024

Well, the example is complete and all the features are there! I assume with #618 it would be the same. I may check it later.

mod parser;
mod sequence;

pub mod precedence;

Check failure

Code scanning / clippy

missing documentation for a module

}

#[derive(Debug, Clone, Copy)]
pub enum Assoc {

Check failure

Code scanning / clippy

missing documentation for an enum

#[derive(Debug, Clone, Copy)]
pub enum Assoc {
Left(i64),

Check failure

Code scanning / clippy

missing documentation for a variant

#[derive(Debug, Clone, Copy)]
pub enum Assoc {
Left(i64),
Right(i64),

Check failure

Code scanning / clippy

missing documentation for a variant

pub enum Assoc {
Left(i64),
Right(i64),
Neither(i64),

Check failure

Code scanning / clippy

missing documentation for a variant

@epage
Collaborator

epage commented Nov 19, 2024

@39555 I have to say, I am thoroughly impressed with the dedication you have put into this investigation, having

  • Implemented the initial recursive version
  • Implemented an iterative version out of my concern for stack overflows
  • Implemented a full C expression parser with it to see what features are missing, adding features that I'm not seeing in other libraries that provide a generic "pratt" parser

ssmendon added a commit to ssmendon/rollers that referenced this pull request May 31, 2025
This commit:
* Fixes errors due to winnow 0.7 migration
* Adapts winnow-rs/winnow#620 to work outside winnow
* Uncomments test cases

For the migration steps, I've attached CHANGELOG.md line numbers for
context.

Use winnow-rs/winnow@73c6e05 for the
version of CHANGELOG.md in winnow-rs/winnow, as the line numbers will
inevitably change with time.

The main migration steps are:

  * `PResult` replaced with `winnow::Result` (L153, L92-L98)
  * Use `winnow::Result` over `ModalResult` when `cut_err` isn't used
  * Swap `ErrMode::from_error_kind` to `ParserError::from_input` (L89)

References: winnow-rs/winnow#618
References: winnow-rs/winnow#620
References: winnow-rs/winnow#622
ssmendon added a commit to ssmendon/rollers that referenced this pull request May 31, 2025
A new feature 'unstable' publishes module internals for tests and
examples only. This isn't a general-purpose winnow crate, so keep it
private by default.

Refs: winnow-rs/winnow#622
ssmendon added a commit to ssmendon/rollers that referenced this pull request May 31, 2025
@ssmendon
Contributor

I don't know what the status is on this PR, but I was working on fixing this up for v0.7 for my own crate.

You can check it out at https://github.com/ssmendon/rollers/tree/main/crates/pratt if it's helpful at all.

@epage epage mentioned this pull request Jul 7, 2025
ssmendon added a commit to ssmendon/winnow that referenced this pull request Jul 26, 2025
ssmendon added a commit to ssmendon/winnow that referenced this pull request Jul 27, 2025
This commit adds a new example `c_expression` for the Pratt parser added
to the `expression` module. It also adds this example to the language
special topic.

This example was primarily authored by @39555 in PR winnow-rs#622. @ssmendon
rebased it to the newest release of winnow and added extra
documentation.

Co-authored-by: Sohum Mendon <smendon@proton.me>
ssmendon added a commit to ssmendon/winnow that referenced this pull request Jul 27, 2025
ssmendon added a commit to ssmendon/winnow that referenced this pull request Jul 30, 2025
ssmendon added a commit to ssmendon/winnow that referenced this pull request Aug 9, 2025
lu-zero pushed a commit to lu-zero/winnow that referenced this pull request Oct 11, 2025
lu-zero pushed a commit to lu-zero/winnow that referenced this pull request Oct 11, 2025
@epage epage closed this Nov 27, 2025