Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another batch of Clojure enhancements #729

Merged
merged 37 commits into from
Jun 11, 2024

Conversation

mauricioszabo
Copy link
Contributor

@mauricioszabo mauricioszabo commented Sep 19, 2023

Replaces #663 because this is more complete.

That was a journey, but basically Clojure highlights some interesting things:

  1. Added a different Clojure grammar - https://github.com/mauricioszabo/tree-sitter-clojure - that adds quotes for strings (so we can add injection points on strings)
  2. Highlights some interesting cases like syntax quoting and unquoting (a screenshot with a custom CSS to highlight these scopes):
    image
  3. Metadata highlighting (again, with custom CSS):
    image

Missing:

  • Make EDN grammar behave like Clojure (most plug-ins don't work if you don't detect the filetype as Clojure)
  • Check if there are performance issues that can be solved in this new scope resolver
  • Changelog update

@mauricioszabo mauricioszabo marked this pull request as ready for review September 21, 2023 03:59
@mauricioszabo mauricioszabo changed the title Fixed Clojure's non-word chars Another batch of Clojure enhacements Sep 21, 2023
@mauricioszabo mauricioszabo changed the title Another batch of Clojure enhacements Another batch of Clojure enhancements Sep 21, 2023
@Spiker985
Copy link
Member

Looks like this is still set up for the legacy tree-sitter implementation, it would need to use the parserSource key in order to be a "new tree-sitter" grammar.

It also looks like this adds some new functions to some base files - won't those potentially break other tree-sitter grammars?

@mauricioszabo
Copy link
Contributor Author

@Spiker985 there's no legacy Clojure tree-sitter actually, so I'm unsure what you mean with this options.

As for the new options, they are just new things we support, if other grammars don't use these they won't be affected. But before everything, I need to fix these grammars (or even remove what I did) - injection points are not correct, and they are janky when editing these forms...

@savetheclocktower
Copy link
Contributor

Looks like this is still set up for the legacy tree-sitter implementation, it would need to use the parserSource key in order to be a "new tree-sitter" grammar.

parserSource is a key that doesn't have a function right now except to help the user understand where a compiled WASM file comes from, and when it might need updating. That said, it does need to be present in order to make CI happy, which is why the “Validate WASM Grammar PR Changes” check is failing.

@savetheclocktower
Copy link
Contributor

It also looks like this adds some new functions to some base files - won't those potentially break other tree-sitter grammars?

The changes introduced to core don't change how any existing features behave. We've got

  • A new scope test called ancestorTypeNearerThan (which I think I proposed a while back as a way of solving a problem that Clojure had; it makes several different tests of node ancestry, and without something like this, they're bound to conflict with one another unless you can tell which one is the closest ancestor).
  • A new argument to the content callback of atom.grammars.addInjectionPoint, which won't bother anything that isn't expecting a second argument. I didn't dig too deeply into how it's used in the PR because Clojure scares me, but including the Buffer as a second argument is fine with me, and doesn't have any real downsides that I can think of.

If the tests pass, I'm good.

@DeeDeeG
Copy link
Member

DeeDeeG commented May 8, 2024

We can try merging in the macOS CI fix from master branch if wanting a proper CI run for this PR.

// the second,third,etc argument
ancestorTypeNearerThan(node, types) {
let [target, ...rejected] = types.split(/\s+/);
rejected = new Set(rejected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're almost certainly not better off converting the rest to a Set. Maybe if you were caching these items by string value (which might not be a bad idea), but otherwise I suspect it's quicker to keep the rest as an array and do a rejected.includes(current.type) test instead.

Probably not a huge deal, but I'd suggest doing a profiling stress test here. Any predicate that loops is worth optimizing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I did profile - it indeed makes no difference, so I went with the Set because it's simpler.

What I found is that what a lot of nodes (like, 1 million tests for example) the Set implementation is faster, even when the Set contains 1 or 2 elements only. But Pulsar doesn't even supports 1 million "parent" nodes, so we're safe here :)

CHANGELOG.md Outdated Show resolved Hide resolved
mauricioszabo and others added 2 commits June 7, 2024 11:03
Co-authored-by: Andrew Dupont <[email protected]>
Co-authored-by: DeeDeeG <[email protected]>
Copy link
Member

@DeeDeeG DeeDeeG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More or less rubber-stamp approved from me, hopefully others on the team with more knowledge in this area are happy with the PR now as well?

If they have any feedback, I defer to them, however.

@mauricioszabo mauricioszabo merged commit 7e38c83 into master Jun 11, 2024
104 checks passed
@mauricioszabo mauricioszabo deleted the clojure-grammar-enhancements branch June 11, 2024 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants