refactor: downgrade MLIR generic parser from syntax categories to closed syntax #364

alexkeizer · 2024-05-31T16:36:33Z

Currently almost all parts of the generic MLIR parser uses syntax categories, meaning these parts of the (generic!) grammar are in theory extensible.

In theory this can be very convenient: if a dialect wanted to customize only their operands it could choose to hook into just that part of the parser. In practice, I would expect that even the low level elements (such as mlir_op_operand) being extensible is an anti-pattern:

Such extensions would not be scoped to just that dialect, it would affect parsing of all generic MLIR syntax
The parser doesn't really seem to be written for such extensions anyway. I tried to come up with an example of extending just the operands, but the most obvious ways that came to mind didn't actually work (macro "&" n:num : mlir_op_operand => `(mlir_op_operand| % $n:num)).
Controlling which parts of the grammar are extensible would make it easier to eventually write an elaborator that outputs a meta-level AST, instead of a Lean expression of type AST, if we eventually decide to go that way.

Thus, I propose we keep mlir_op (and potentially mlir_type) as the top-level syntax category that can be hooked into, but make low-level elements of the grammar closed. This PR is still a draft, but it show-cases what this change would look like for the case of mlir_op_operand.

This matches Lean itself, where term and command are extensible, but things like def are closed syntax.
Also note that this is exactly the level at which the current LLVM pretty syntax operates: that only extends mlir_op and would still work as-is after this refactor.

…sed category

alexkeizer · 2024-05-31T16:42:03Z

@goens @bollu @tobiasgrosser I'm curious what you think about this refactor?

github-actions · 2024-05-31T16:44:10Z

Alive Statistics: 54 / 93 (39 failed)

tobiasgrosser · 2024-05-31T16:50:31Z

I think this is a great idea.

bollu · 2024-05-31T17:03:07Z

@alexkeizer thanks for thinking about this, I'm generally happy with this change in terms of API surface area.

I don't quite understand:

Controlling which parts of the grammar are extensible would make it easier to eventually write an elaborator that outputs a meta-level AST, instead of a Lean expression of type AST, if we eventually decide to go that way.

Could you elaborate on this? Is it that each syntax category is a construction for our meta-level-AST, and having many constructors makes this unmanagable?

github-actions · 2024-05-31T17:07:48Z

Alive Statistics: 54 / 93 (39 failed)

alexkeizer · 2024-05-31T17:16:08Z

@alexkeizer thanks for thinking about this, I'm generally happy with this change in terms of API surface area.

I don't quite understand:

Controlling which parts of the grammar are extensible would make it easier to eventually write an elaborator that outputs a meta-level AST, instead of a Lean expression of type AST, if we eventually decide to go that way.

Could you elaborate on this? Is it that each syntax category is a construction for our meta-level-AST, and having many constructors makes this unmanagable?

So what I'm thinking about is having a function like TSyntax `mlir_op_operand -> MetaM SSAVal . ¹ If mlir_op_operand is closed, we can just match on exactly the syntax we define, and nothing else is possible. If it's a category, though, we're promising the syntax is extensible, we would then have to somehow hook into macro expansion to make that promise actually work.
Of course, we could just say that, even though it's a category, it's not actually intended to be extensible and throw some kind of error if we encounter unexpected syntax, but I'd rather codify this.

With the caveat that SSAVal might have to become some MetaSSAVal that can possibly store a Lean expression for something, to support antiquotations ↩

Apparently closed `syntax` is namespaced, while syntax categories live in the root namespace

github-actions · 2024-05-31T18:59:00Z

Alive Statistics: 54 / 93 (39 failed)

github-actions · 2024-05-31T19:13:28Z

Alive Statistics: 54 / 93 (39 failed)

github-actions · 2024-05-31T19:15:51Z

Alive Statistics: 54 / 93 (39 failed)

Apparently the closed syntax breaks `opt`

goens · 2024-06-01T06:16:28Z

oh that's pretty cool! I didn't know about this difference (between closed syntax and extensible syntax categories).

I think it makes a lot of sense, thanks for the very thorough explanation @alexkeizer!

About the meta level ast, I also don't quite understand the point. Even if we choose to do things at elaboration time, what's the drawback of using our (current) AST data structure?

Or do you mean just that we can do (full) pattern matching on the closed syntax categories, so it's about the TSyntax mlirsoandso types? When talking about these self-made compilers from custom syntax, Mac once asked us why we wanted our own data structure and not just used TSyntax. Maybe using closed syntax is what we were missing for that, if I'm understanding you correctly

alexkeizer · 2024-06-03T12:22:50Z

About the meta level ast, I also don't quite understand the point. Even if we choose to do things at elaboration time, what's the drawback of using our (current) AST data structure?

So the main point is to correctly communicate what is and isn't supported.

With a closed syntax we know exactly what forms of syntax to expect when we pattern-match on the TSyntax. I don't think the exhaustive check knows about this, so it definitely still makes sense to parse the TSyntax into our AST datastructure, but this parsing would have to do a pattern match on the syntax to do so, and presumably throw an error when it sees syntax that doesn't match the option we've given (which, if we wrote the match correctly, should never happen, because we control what syntax gets parsed).

An option that totally works, is to have a syntax category and just only match on the syntax we've defined. Then, when the category is extended, and this new syntax is actually used, we just throw the error that says "well, this syntax category is not actually meant to be extended". This option would be strictly easiest to implement, but it's unsatisfying IMO.

If we have syntax categories, and would like to properly support extensions, then in the function that parses the TSyntax to produce our AST we'd have to hook into the macro expansion system before pattern matching. This seems excessively complicated.

A slight road bump: apparently the machinery in the opt executable that parses things at runtime works specifically with a category parser. I'm not sure how to change that to work with a ParserDescr (as obtained from the closed syntax). There's compileParserDescr, but this runs in ImportM. I might keep mlir_region as a category for now to avoid dealing with this

This reverts commit cc1112d.

refactor: downgrade mlir_op_operand from a syntax category to a clo…

5050a28

…sed category

refactor: close mlir_ops

1954068

alexkeizer added 3 commits May 31, 2024 19:44

refactor: close regions

cc1112d

refactor: remove unused mlir_bb syntax category

b262f4c

fix build by opening MLIR.EDSL namespace

6c1e3e3

Apparently closed `syntax` is namespaced, while syntax categories live in the root namespace

alexkeizer added 2 commits May 31, 2024 20:05

refactor: close affine_expr, _tuple, and _map syntax

a52b0f5

refactor: remove unused mlir_int syntax

d4868b2

WIP: turn mlir_region back into a category

b61194f

Apparently the closed syntax breaks `opt`

WIP

ba99b51

alexkeizer added 5 commits June 3, 2024 16:19

Merge branch 'main' into mlir-parser-refactor

cdf0ee1

Revert "refactor: close regions"

4571064

This reverts commit cc1112d.

refactor: put all EDSL tests in sections

4b0802b

document failing test case for region antiquotations

645fd86

Merge branch 'private-edsl-tests' into mlir-parser-refactor

cc4d389

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: downgrade MLIR generic parser from syntax categories to closed syntax #364

refactor: downgrade MLIR generic parser from syntax categories to closed syntax #364

alexkeizer commented May 31, 2024 •

edited

Loading

alexkeizer commented May 31, 2024

github-actions bot commented May 31, 2024

tobiasgrosser commented May 31, 2024

bollu commented May 31, 2024

github-actions bot commented May 31, 2024

alexkeizer commented May 31, 2024 •

edited

Loading

github-actions bot commented May 31, 2024

github-actions bot commented May 31, 2024

github-actions bot commented May 31, 2024

goens commented Jun 1, 2024

alexkeizer commented Jun 3, 2024

refactor: downgrade MLIR generic parser from syntax categories to closed syntax #364

Are you sure you want to change the base?

refactor: downgrade MLIR generic parser from syntax categories to closed syntax #364

Conversation

alexkeizer commented May 31, 2024 • edited Loading

alexkeizer commented May 31, 2024

github-actions bot commented May 31, 2024

tobiasgrosser commented May 31, 2024

bollu commented May 31, 2024

github-actions bot commented May 31, 2024

alexkeizer commented May 31, 2024 • edited Loading

Footnotes

github-actions bot commented May 31, 2024

github-actions bot commented May 31, 2024

github-actions bot commented May 31, 2024

goens commented Jun 1, 2024

alexkeizer commented Jun 3, 2024

alexkeizer commented May 31, 2024 •

edited

Loading

alexkeizer commented May 31, 2024 •

edited

Loading