Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump AST to OCaml 5.2.0 #514

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

patricoferris
Copy link
Collaborator

This is a draft PR to bump the AST of ppxlib to OCaml's 5.2 AST. The changes to the internal representation of functions are the main challenge, regardless of what we do we will need to patch quite a few ppxes that directly pattern-match on the constructors of the AST for functions.

There are a lot of changes to the pretty-printing too which I copied mostly from upstream -- it is unclear to me if this file is supposed to be copied across or updated with ppxlib specific changes ? @NathanReb or @pitag-ha maybe you could shed some light here. This is currently breaking this PR on older compiler versions but I'm not 100% sure what is causing that or where the check is ?

@patricoferris
Copy link
Collaborator Author

patricoferris commented Jul 29, 2024

Some code is not round-tripping here and I think it might be caused by (although I could be wrong) function arity issues. See the 4.12.1 build in the CI (there are other issues too I know). But it seems something like

type t = { x : int; y : int }
let f = fun z -> fun { x; y } -> x + y + z

and this roundtrip to

let f = fun z { x; y } -> x + y + z

Though I thought the latter is just syntactic sugar for the former, but then I haven't read the new arity change PR in full yet!

The actual code causing this is:

method include_infos :
  'a . ('a -> 'a) -> 'a include_infos -> 'a include_infos=
  fun _a ->
    fun { pincl_mod; pincl_loc; pincl_attributes } ->
    let pincl_mod = _a pincl_mod in
    let pincl_loc = self#location pincl_loc in
    let pincl_attributes = self#attributes pincl_attributes in
    { pincl_mod; pincl_loc; pincl_attributes }

Although it is unclear given the diff presented. Would it make more sense to write a structural equality checker ourselves that could output the problematic AST nodes when they don't compare correctly rather than using Poly.( <> ) + s-expression diff?

@NathanReb
Copy link
Collaborator

Having a proper comparison function for this and a proper printer would be great yes. I guess the printer is a bit more important in the short term as it would help us solve this particular issue.

We can probably work out a decent enough printer using Ast_traverse.lift.

@NathanReb
Copy link
Collaborator

From a quick look I think it's the other way around, our migrations are bugged in a way that will turn fun x y -> x + y into fun x -> fun y -> x + y.

I'm looking into testing and fixing this!

@patricoferris
Copy link
Collaborator Author

Another tricky corner case that has appeared during the 5.2 AST bump is that local opens in types make it very hard (perhaps impossible) to know if a type is recursive or not using really_recursive/type_is_recursive.

module M = struct
  type t = A
end

type t = M.(t)

When asked if type t = M.(t) is recursive or not are tests say yes. Of course, it could be the case that it actually is recursive. We just don't know without going to look at the module and working out if there's actually a t in there if it is or not.

Of course this was already the case with global opens too I suppose -- just thought I'd add that here.

@patricoferris patricoferris force-pushed the 5.2-bump branch 2 times, most recently from 6fdbb76 to d22f931 Compare September 9, 2024 09:13
@patricoferris patricoferris marked this pull request as ready for review September 14, 2024 09:34
Copy link
Collaborator

@NathanReb NathanReb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

I think I need a bit of context on a few changes to better understand them but overall this looks solid.

@@ -228,6 +244,7 @@ module Patt = Make (struct
Some (fun loc -> ppat_any ~loc:{ loc with loc_ghost = true })

let attributes = Some (fun loc -> ppat_any ~loc:{ loc with loc_ghost = true })
let coalesce = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when we're generating patterns to match over anonymous functions?

E.g. what will this generate:

match expr with
| [%expr fun x y -> z] -> ...

If I understood the expression counterpart correctly, we generate Ppxlib.Ast_builder.Default.coalesce_arity node instead of node right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment here: #514 (comment)

src/ast_builder.ml Show resolved Hide resolved
src/ast_builder_intf.ml Show resolved Hide resolved
src/driver.ml Outdated Show resolved Hide resolved
src/gen/gen_ast_pattern.ml Show resolved Hide resolved
let func = [%expr fun x -> [%e f]] in
Format.asprintf "%a" Astlib.Pprintast.expression func
[%%expect{|
- : string = "fun x y z -> (x + y) + z"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this works because this triggers multiple calls to coalesce_arity, one for each [%expr ...].

Is there a way we could add a test for a single metaquot invocation? E.g. would [%expr fun x -> fun y -> fun z -> ...] raise the same result?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah we're going to run into trouble here. There are two distinct cases:

OCaml 5.1 or less

AST migration creates max arity functions for us regardless of how we build the function. Because we don't migrate downwards, the function remains max arity. There's not much we can do about this behaviour as we have no way to distinguish between [%expr fun x y -> x + y] and [%expr fun x -> fun y -> x + y]. So, no matter what, in this case we will get a maximum arity function.

OCaml 5.2 and above

It is possible for us to create functions that are not max arity and the AST migrations will not take care of this. Something like [%expr fun x -> fun y -> x + y] will not be max arity on it's own. In the 5.2+ parsetree this will be two single argument functions. In the case where this is used as an expression, metaquot will insert calls to coalesce_arity between each function. This is because the lift object in metaquot applies coalesce to every expression.

For expressions in patterns, we do not apply the function allowing users to distinguish between the two representations:

let f v = match v with
  | [%expr fun x -> fun y -> x + y] -> ()
  | [%expr fun x y -> x + y] -> ()
  | _ -> ()

However, I'm realising that on OCaml 5.1 or less, the migration will make these two cases identical and perhaps we should coalesce in this case to make the behaviour the same on all compilers? If someone wishes to distinguish between the two representations then they will have to use constructors directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this test: 908a820

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree it's a bit of tough case here but I'm slightly leaning towards having a unified behaviour here as it can otherwise produce semantically different code depending on the version of the compiler.

I think if one were to pattern-match over arity on functions with metaquot, they could still do something like this:

match expr with
| [%expr fun [%p? arg] -> [%e? body]] -> ...

in a recursive manner.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think?

The current, internal AST is 4.14. With this change it is now 5.2. The
most notable changes are the representation of functions and locally
opening modules for type definitions.

Signed-off-by: Patrick Ferris <[email protected]>
Signed-off-by: Patrick Ferris <[email protected]>
Signed-off-by: Patrick Ferris <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants