Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree-sitter rolling fixes, 1.120 edition #1062

Merged
merged 8 commits into from
Aug 16, 2024

Conversation

savetheclocktower
Copy link
Contributor

Fixes #868.

I feel lucky to be able to introduce this rolling PR so soon after the release. This first commit fixes a strange folding behavior that had been hanging around for about six months.

…when determining suggested indent.

There's a lot of nuance here about which layer gets asked the question of what the current line's indentation level should be. To make sure we're asking the right layer, we want to consider the deepest layer that is active at a given point.

But since injections rarely cover the whitespace that precedes them, we generally want to look at the beginnings and ends of the _content_ on various lines. So if the first ten characters of a line are whitespace, we should ask what the controlling layer is at column 10, not at column 0.
…which has better support for certain URLs.
(could've sworn I did this already)
…like `WASMTreeSitterGrammar::onDidLoadQueryFiles`, a callback that lets a user know when queries have been loaded and are able to be customized.

Also makes it so that `WASMTreeSitterGrammar::setQueryForTest` implicitly causes a `WASMTreeSitterLanguageMode` to reload the given query.
…when the user is typing and doesn't yet have a valid tree.
@@ -182,6 +182,7 @@ module.exports = class WASMTreeSitterGrammar {
}).then(() => {
this._queryFilesLoaded = true;
this._loadQueryFilesPromise = null;
this.emitter.emit('did-load-query-files', this);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll pre-emptively explain this so that it's here when someone (historically @DeeDeeG) looks over it.

A feature was requested that would require a new indentation query. I wanted to give the user a workaround that they could drop into their init file, but found it surprisingly hard to pull it off.

The goal of the code in that comment is to take the content that's already in the indentation query, add an extra statement, then set that as the new indentation query for the grammar… and have it apply to all editors without having to close and open a given file again.

But:

  • You can't augment the existing indentation query until the grammar has initialized, or else you won't know what the current query is
  • You can edit the query at that point, and set it as the new query for the grammar, but that doesn't automatically propagate to open files, since their instances of WASMTreeSitterLanguageMode fetched queries from the grammar already and will have to be told to reload them

Hence this fix:

  • The new onDidLoadQueryFiles method tells you exactly when the queries have been loaded from disk
  • The setQueryForTest method (originally meant only for specs, but I'll probably rename it at some point) will now fire an event that WASMTreeSitterLanguageMode automatically subscribes to

This is still a sloppy way to customize one's Tree-sitter queries, and there's further room for improvement in the future; but this improves the status quo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: 3b25873)

Asking mostly because it's a bit arcane to me: Is this implementation risking being over-complicated?

If you envision some hypothetical world where a more green-field design were possible, is this how you'd want it to be done?

If not, is the limitation causing it to be done this way handed to us by Tree-sitter, or Atom/Pulsar stuff? Or is this sort of a convenient way to implement? And... any foreseen issues with this approach?

(Not sure why I'm asking all these questions, just trying to pick your brain a bit I guess. Good to know if you've considered it well before we merge it. Seems from the fact you chose to write it up preemptively that you probably have considered it. But anyway, bustin' chops today.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking mostly because it's a bit arcane to me: Is this implementation risking being over-complicated?

I don't think so? Some of the arcana about how query files are loaded could maybe be simplified, but the basic way queries work — one query file for syntax highlighting, one for indentation, one for folding… — is broadly how Neovim does things, and probably other editors.

By comparison: if you wanted to customize how a TextMate grammar did…

  • indentation: it was easy to override the existing value… but hard to augment it, since the existing value was a complex regular expression.
  • folds: largely the same as indentations.
  • syntax highlighting: no way to override without basically writing a whole new grammar.
  • symbols: no way to override whatsoever.

I can't tell which aspect of the system you're asking about, but I can explain further if you have follow-up questions.

@savetheclocktower savetheclocktower marked this pull request as ready for review August 9, 2024 17:34
Comment on lines +3349 to +3368
// Previously we were storing compiled queries directly on the language
// layer. Now we store them on a `queries` object instead.
//
// All internal usages have been changed, but packages might still try to
// find these queries at their old locations. For that reason, we'll define
// shims for backward compatibility.
get highlightsQuery() { return this.queries.highlightsQuery; }
set highlightsQuery(value) { this.queries.highlightsQuery = value; }

get indentsQuery() { return this.queries.indentsQuery; }
set indentsQuery(value) { this.queries.indentsQuery = value; }

get foldsQuery() { return this.queries.foldsQuery; }
set foldsQuery(value) { this.queries.foldsQuery = value; }

get tagsQuery() { return this.queries.tagsQuery; }
set tagsQuery(value) { this.queries.tagsQuery = value; }

get localsQuery() { return this.queries.localsQuery; }
set localsQuery(value) { this.queries.localsQuery = value; }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need some explaining!

This was the result of an arbitrary choice I made about 16 months ago: to store the compiled queries as direct properties of each LanguageLayer.

  • It feels untidy.
  • It doesn't keep the queries themselves grouped together — as one might want to do if there's a reason to iterate through them.
  • This is all still private API (as is all of WASMTreeSitterLanguageMode, but if that ever changes, I want this stuff to be more organized.

So now we're keeping them on a queries object. Some packages probably expect the queries to be present at their old locations, so I've added shims to support getting and setting the current list of queries from their legacy locations.

This is not a big deal. But the best time to have done this was ~16 months ago; the second best time is now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: 3b25873)

I'm going to go with "I believe you". Sounds like the correct response.

Thank you for the notes!

Copy link
Member

@confused-Techie confused-Techie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a look through this, and the changes seem straightforward enough I could mostly follow lol.

But tests are happy, the documentation within the code, and comments are verbose, precise, and perfectly explain to myself and others what's going on. All the code looks perfectly valid, and my understanding of changes within JS and TypeScript made total sense.

Plus there's an emphasis on cleaning up and organizing code, while being intentionally backwards compatible. So what's not to love? This all looks great to me, and I especially love "the second best time is now" comment

Copy link
Member

@DeeDeeG DeeDeeG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kinda trusting that this works as intended, but I approve of the intent as described/stated, and I'm satisfied that I understand more or less what this is all intending to do.

So, to the extent I can offer meaningful review, and I do try, this gets a 👍 from me.

One typo found, look for the word ACTIONABLE to find it amongst my various review comments.

@@ -841,6 +841,9 @@
"new" @keyword.operator.new._LANG_

"=" @keyword.operator.assignment._LANG_

["&" "|" "<<" ">>" ">>>" "~" "^"] @keyword.operator.bitwise.js
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: af133b0) Was going to ask if we should include bitwise assignment operators somehow, but I see they are already there later in the file...

I guess that's the part we did already. Heh.

Well, good to get these as well!

(For reference, if we want to check if we've got all of these, though I'm not requesting that as a reviewer for the present PR. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re 11e2f75)

handle a commonly encountered error state when the user is in the middle of typing a switch statement.

Sounds reasonable to me!

Comment on lines +3349 to +3368
// Previously we were storing compiled queries directly on the language
// layer. Now we store them on a `queries` object instead.
//
// All internal usages have been changed, but packages might still try to
// find these queries at their old locations. For that reason, we'll define
// shims for backward compatibility.
get highlightsQuery() { return this.queries.highlightsQuery; }
set highlightsQuery(value) { this.queries.highlightsQuery = value; }

get indentsQuery() { return this.queries.indentsQuery; }
set indentsQuery(value) { this.queries.indentsQuery = value; }

get foldsQuery() { return this.queries.foldsQuery; }
set foldsQuery(value) { this.queries.foldsQuery = value; }

get tagsQuery() { return this.queries.tagsQuery; }
set tagsQuery(value) { this.queries.tagsQuery = value; }

get localsQuery() { return this.queries.localsQuery; }
set localsQuery(value) { this.queries.localsQuery = value; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: 3b25873)

I'm going to go with "I believe you". Sounds like the correct response.

Thank you for the notes!

@@ -182,6 +182,7 @@ module.exports = class WASMTreeSitterGrammar {
}).then(() => {
this._queryFilesLoaded = true;
this._loadQueryFilesPromise = null;
this.emitter.emit('did-load-query-files', this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: 3b25873)

Asking mostly because it's a bit arcane to me: Is this implementation risking being over-complicated?

If you envision some hypothetical world where a more green-field design were possible, is this how you'd want it to be done?

If not, is the limitation causing it to be done this way handed to us by Tree-sitter, or Atom/Pulsar stuff? Or is this sort of a convenient way to implement? And... any foreseen issues with this approach?

(Not sure why I'm asking all these questions, just trying to pick your brain a bit I guess. Good to know if you've considered it well before we merge it. Seems from the fact you chose to write it up preemptively that you probably have considered it. But anyway, bustin' chops today.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Regarding ce6989e)

Seems legit! I believe the explanation, anyway! Code doesn't look too crazy, this is mostly adding specs, by diff line count. 👍 Specs are good to have!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Regarding 358cb02)

Well explained, and I read the associated issue report. Seems like a good way of handling this, effectively the outer fold range starting on a line takes precedence seems to be the practical effect. Reasonable, IMO. 👍

@@ -5,5 +5,6 @@ parser: 'tree-sitter-hyperlink'

injectionRegex: 'hyperlink'
treeSitter:
parserSource: 'github:savetheclocktower/tree-sitter-hyperlink#04c3a667ba432236578ac99bbacd0412f88d6fac'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 (re: 8090558)

this.observeQueryFileChanges();
}
// This used to be called only in dev mode. But there are other use cases
// for dynamically reloading queries, so now we observer for changes in a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACTIONABLE (though not terribly important!)

Tiny typo "observer" --> "observe" ?

Not a blocker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've already caught this one and it'll be addressed in next month's PR. :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re 4388bce)

Specs are appreciated! 👍

Comment on lines +4 to +6
function layerHasTagsQuery(layer) {
return layer.queries?.tagsQuery ?? layer.tagsQuery;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: 6846b9c)

Had to brush up on optional chaining to see that this checks out. It does appear to check out as valid use of optional chaining.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is me being incredibly paranoid, since we'd only fall back to the right side of that ?? expression if someone extracted the newer version of this package and added it to an older version of Pulsar.

@DeeDeeG
Copy link
Member

DeeDeeG commented Aug 16, 2024

Two review-approves and both confused and I are comfortable merging this (context: prepping 1.20.0 at the moment), so, merging!

In terms of things to possibly circle back to later: there is that one typo I found, and a question I asked for clarification / "picking your brain" purposes, but neither are blockers!!

@DeeDeeG DeeDeeG merged commit d63ab23 into pulsar-edit:master Aug 16, 2024
104 checks passed
@savetheclocktower savetheclocktower deleted the tree-sitter-august branch August 16, 2024 03:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tree sitter breaks code folding in certain cases
3 participants