Change the AST to make it easier to pattern match and remove ambiguities #24
Comments
Some side effects of this:
All of these are easily addressable.
You can try these changes (with the exception of the {:sourceror, github: "doorgan/sourceror", branch: "rich_ast"} CC @marcandre
Sigils were changed to
@doorgan would it be wiser to wait for these changes to be applied before using
Hi @msaraiva, I don't see these changes applied to the main branch anytime soon, as it's still too experimental and I realize it will take quite some time to find something that works fine for Sourceror use cases. When the time to merge them comes closer I'll let you know and I'll be happy to help you migrate the Surface scripts :)
@doorgan sounds good! I didn't know these changes were that far away. I'll give Cheers.
@msaraiva I initially intended to apply them in 0.9, but I started to deviate too much from the original goal and I want to have a clearer spec first. There are a lot of nuances when it comes to having completely unambiguous node types and keeping the 3-tuple shape that I need to resolve 🙂. So maybe they will land in 0.10 or 0.11.
This is derived from a discussion at #23. The tl;dr is that wrapping literals in `:__block__` works, but makes them cumbersome to pattern match against. Some other nodes could be changed as well, like sigils.

The proposed changes would map the following syntaxes as follows:
- `foo` -> `{:var, metadata, :foo}`
- `400_000` -> `{:int, [token: "400_000" | metadata], 400000}`
- `42.0` -> `{:float, [token: "42.0" | metadata], 42.0}`
- `:foo` -> `{:atom, metadata, :foo}`
- `[1, 2]` -> `{[], metadata, [1, 2]}`
- `{1, 2}` -> `{:{}, metadata, [1, 2]}`
- `~w[a b c]` -> `{{:sigil, "w"}, metadata, [segments, modifiers]}`
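Part of the mapping above can be sketched as a small conversion from the `:__block__`-wrapped form. This is only an illustration under stated assumptions: `RichAstSketch.to_rich/1` is a hypothetical helper, not part of Sourceror; it converts a single node without recursing; it only covers integers, floats, atoms, variables and sigils; and it assumes the source was parsed with the `:literal_encoder` and `:token_metadata` options of `Code.string_to_quoted!/2`. Sigils are told apart from plain `sigil_w(...)` calls by looking up the `:delimiter` key in the metadata, not by matching on metadata order.

```elixir
defmodule RichAstSketch do
  # Hypothetical helper, not part of Sourceror: converts ONE node from the
  # `:__block__`-wrapped form into the node shapes proposed in this issue.

  # Wrapped literals. Note that nil, true and false are atoms too.
  def to_rich({:__block__, meta, [value]}) when is_integer(value), do: {:int, meta, value}
  def to_rich({:__block__, meta, [value]}) when is_float(value), do: {:float, meta, value}
  def to_rich({:__block__, meta, [value]}) when is_atom(value), do: {:atom, meta, value}

  # A sigil is a two-argument `sigil_*` node that carries `:delimiter`
  # metadata. The key is looked up by name, so metadata order is irrelevant.
  def to_rich({name, meta, [segments, modifiers]} = node) when is_atom(name) do
    case {Atom.to_string(name), Keyword.has_key?(meta, :delimiter)} do
      {"sigil_" <> sigil, true} -> {{:sigil, sigil}, meta, [segments, modifiers]}
      _ -> node
    end
  end

  # Variables are `{name, meta, context}` where the context is an atom (or nil).
  def to_rich({name, meta, context}) when is_atom(name) and is_atom(context) do
    {:var, meta, name}
  end

  # Everything else is left untouched in this sketch.
  def to_rich(other), do: other
end

# Wrap every literal in a `:__block__` node and keep the original number token.
opts = [token_metadata: true, literal_encoder: &{:ok, {:__block__, &2, [&1]}}]

for source <- ["foo", "400_000", "42.0", ":foo", "~w/foo bar/a"] do
  source
  |> Code.string_to_quoted!(opts)
  |> RichAstSketch.to_rich()
  |> IO.inspect(label: source)
end
```

With these shapes a caller can match on `{:int, _, value}` or `{{:sigil, "w"}, _, _}` directly instead of unwrapping `:__block__` and inspecting the wrapped value.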
`{:sigil, meta, ["w", segments, modifiers]}` for sigils would be ambiguous with function calls, for example. `~w/foo #{:bar}/a` and `foo("w", "foo #{:bar}", 'a')` already produce the very same AST, with the only exception that the sigil has a `:delimiter` metadata entry that lets us recognize it as an actual sigil. Because the order of an AST node's metadata is not guaranteed, we can't just pattern match the metadata to assert it's a sigil. Hence the idea of using a tuple as the node type instead.

The same goes for lists. If the list node is `{:list, meta, elements}`, then `[1, 2, 3]` and `list(1, 2, 3)` would produce exactly the same AST, so I was thinking of using `{[], meta, elements}` instead. I'm not very happy with the loss in readability there, but it would work.

There are some opportunities to enrich the AST further, for example, the AST for `A.B.C.fun` is this:

By leveraging the `:static_atoms_encoder` parser option, we could transform it to this:

Which, of course, is a lot more verbose, but on the other hand it allows us to:
a) Have a more consistent AST, meaning we no longer have some places where atoms are wrapped and others where they're not
b) Have more precise information about where exactly each alias segment is happening
For b we would still miss information about where exactly the dots are located, but at least we could accurately replace the inner segments.
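To make the alias discussion concrete, the stock parser can be asked for both forms. The first call shows the shape `Code.string_to_quoted!/2` returns today for `A.B.C.fun`. The second is a sketch that assumes `:static_atoms_encoder` is invoked once per alias segment; the `{:atom, meta, name}` wrapper is borrowed from the mapping proposed in this issue and is not something the parser emits by itself. The encoder here calls `String.to_atom/1`, so it still creates atoms and is not a safe-parsing setup.

```elixir
# Current AST: the alias segments are bare atoms that share one metadata list.
plain = Code.string_to_quoted!("A.B.C.fun")
{{:., _, [{:__aliases__, _, [:A, :B, :C]}, :fun]}, _, []} = plain
IO.inspect(plain, label: "plain")

# Sketch: wrap every statically known atom in a node that keeps the metadata
# the encoder receives for it. The tuple shape is an assumption taken from
# this proposal; only alias segments are exercised here.
encoder = fn name, meta -> {:ok, {:atom, meta, String.to_atom(name)}} end

rich = Code.string_to_quoted!("A.B.C", static_atoms_encoder: encoder)
{:__aliases__, _, segments} = rich
IO.inspect(segments, label: "segments")
```

Because each segment now carries its own metadata, a tool could replace one inner segment without touching the others, which is the benefit described in b).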
The final change I had thought of is to make the parsing safe by default, meaning that parsing a file won't create new atoms. The downside is that the Formatter only works with atoms, so `Sourceror.to_string` would create new atoms anyway and we're back to square one.
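For reference, the standard parser already offers an opt-in form of safe parsing. The sketch below uses the `:existing_atoms_only` option of `Code.string_to_quoted/2`, which rejects source containing atoms that are not already in the atom table. The atom name is made up for the example and assumed to be unused anywhere else in the running system. It only illustrates the trade-off and says nothing about how Sourceror would implement safe parsing.

```elixir
# An atom name that is assumed not to exist in the atom table yet.
source = ":sourceror_issue_24_never_seen_before"

# With `existing_atoms_only: true` the parser refuses to create the atom
# and returns an error tuple instead of an AST.
unknown = Code.string_to_quoted(source, existing_atoms_only: true)
IO.inspect(unknown, label: "unknown atom")

# Atoms that already exist parse as usual.
known = Code.string_to_quoted(":ok", existing_atoms_only: true)
IO.inspect(known, label: "known atom")
```

Rejecting unknown atoms avoids growing the atom table, but it also means such files cannot be parsed at all, which is a different trade-off from encoding atoms as non-atom terms and converting them back when formatting.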