-
-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a function to extract patches from a changed tree #14
Comments
Producing patches like this has some downsides as well. For example, inserting a sibling node could just produce a patch after that node instead of producing a patch for the whole parent node. |
In the Ruby world, what we ended up doing is organizing string patches as a tree of nested ranges with replacements and/or insertions at either end. Crossing ranges (e.g. 1..10 and 5..15) generate errors. A more detailed explanation is in this PR. Note: I've just started to look at sourceror today, looks super intersesting and I'm a newcomer to Elixir. I wrote the rewriter for the Ruby |
@marcandre - welcome on board - another AST guru! Given your RuboCop experience, you might be interested in the work we are doing now to integrate with Credo. This integration will be a big project - lessons learned and feedback are welcome. @doorgan - in Rfx we've been fiddling with approaches for diff and patch. It would probably be good if Rfx and Sourceror used identical or compatible diff/patch strategies. |
Hi @marcandre! Your tree approach looks really interesting, I'll keep digging into it, thanks for the link! My concern in this issue is that I'm looking for a way to infer the patches to be generated based on a modified tree, but I see the approach from the
The Elixir AST is a really simple data structure, I have an article that you can use for reference, and I also wrote an article that describes Sourceror's approach. In essence(and specially in Sourceror's case) an AST node is a tuple with 3 elements like Macro.prewalk(ast, fn {:__block__, _metadata, [integer]} when is_integer(integer) -> IO.puts("Found an integer!") end) Now, I see that RuboCop's NodePath has some operators like I believe a port of NodePath should be achievable using that Zipper API. I've been entertaining similar ideas while looking at projects such as grasp, but I don't have anything concrete. I'm happy to discuss more about how this could be done and Elixir in general in this repo's discussions section if you're interested 🙂 |
@andyl of course! Patches in this context are meant to be the textual changes derived from an change in the AST. In Rfx, diffs and patches are performed over the original and the modified text, is this correct? What I'm exploring in this issue is a case like this one: if not allowed? do
# ...
end A function can find the This modified string is what, to my understanding, would be used in Rfx to generate a diff and apply patches. I have been looking into how Rfx structures everything, and what I understood is that Rfx needs to produce a The patch in Sourceror is merely a map consisting of a range(start and end positions with the same format as the one used in the AST metadata), and a Did I get it right? |
That is correct (although technically you don't even need to navigate the AST to schedule patches, for example you could just want to check the comments which are stored separately). Rest replied in discussion |
Yes. My first implementation was to shell out to linux diff/patch utilities. Would like to migrate to a pure-elixir solution. |
This is a follow up of #13 (comment)
Sourceror.patch_string/2
is able to apply textual patches to multiple ranges of text, but it doesn't check for overlapping ranges, meaning the user has to generate a list of non overlapping patches first, or use some other approach if multiple patches are needed.It would be useful if Sourceror was able to know which nodes were changed during a traversal, such that a second traversal could be performed to collect the old ranges of the modified nodes, and then create a list of non overlapping patches.
For example, if we had some code like this:
And a traversal changes the tree such that
String.to_atom
gets converted toString.to_existing_atom
andunless not allowed?
gets converted toif allowed?
, we would have two overlapping patches, since the inner change would override changes to the outer one and viceversa, depending on the order they are applied.However, if changed nodes are marked as such, it becomes trivial to generate a single patch that does both changes but affects only the biggest range, namely the outer one.
We can leverage the
Sourceror.prewalk
/postwalk
functions for this: we keep a copy of a node before visiting it, and compare it with the node after being visited. If the nodes are different, then the node was changed and we add asourceror: [changed: %{range: original_node_range}]
field to the node's metadata.If instead of pre/postwalk we're using the Zipper API, then functions that change a node should add the same metadata to the node. On deletion, the marker must be added to the parent node metadata before going a step back. On addition of new nodes, the marker must be added to the parent node as well. Users are encouraged to modify nodes through the Zipper API instead of replacing them directly in the zipper.
In both types of traversal, if the node does not have metadata(because we are in a leaf or in a literal for the case of the regular ast), the marker metadata must be added to the parent node metadata if possible.
If the tree does not have the line and column information necessary to generate the ranges, then it makes no sense to try to generate a patch, so in these cases nothing should be done. This is inconsistent behavior, but really I don't see the point of marking a node as changed if we can't make anything useful with that information.
After all the changed nodes are marked, a
Sourceror.patches_from_ast/1
can take the tree, traverse it, and produce patches from the node and the old range. After we get the full list of ranges, if we find overlapping ranges we just discard the ones with a smaller range, as they are already contained by a bigger one.This way we can get a list of patches suitable for use with
Sourceror.patch_string/2
The text was updated successfully, but these errors were encountered: