Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree-sitter rolling fixes, 1.120 edition #1062

Merged
merged 8 commits into from
Aug 16, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,6 @@ parser: 'tree-sitter-hyperlink'

injectionRegex: 'hyperlink'
treeSitter:
parserSource: 'github:savetheclocktower/tree-sitter-hyperlink#04c3a667ba432236578ac99bbacd0412f88d6fac'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 (re: 8090558)

grammar: 'ts/tree-sitter-hyperlink.wasm'
highlightsQuery: 'ts/highlights.scm'
Binary file not shown.
24 changes: 20 additions & 4 deletions packages/language-javascript/grammars/tree-sitter/indents.scm
Original file line number Diff line number Diff line change
Expand Up @@ -16,16 +16,32 @@
(#is? test.last true))
(#set! indent.matchIndentOf parent.startPosition))

; By default, `case` and `default` need to be indented one level more than their containing
; `switch`.
([(switch_case "case" @match) (switch_default "default" @match)]
; By default, `case` and `default` need to be indented one level more than
; their containing `switch`.
([
(switch_case "case" @match)
(switch_default "default" @match)
; We include this one (and check for a switch_statment ancestor) to handle
; a commonly encountered error state when the user is in the middle of typing
; a switch statement.
(ERROR "case" @match)
]
(#is? test.descendantOfType switch_statement)
(#set! indent.matchIndentOf parent.parent.startPosition)
(#set! indent.offsetIndent 1)
(#is-not? test.config "language-javascript.indentation.alignCaseWithSwitch"))

; When this config setting is enabled, `case` and `default` need to be indented
; to match their containing `switch`.
([(switch_case "case" @match) (switch_default "default" @match)]
([
(switch_case "case" @match)
(switch_default "default" @match)
; We include this one (and check for a switch_statment ancestor) to handle
; a commonly encountered error state when the user is in the middle of typing
; a switch statement.
(ERROR "case" @match)
]
(#is? test.descendantOfType switch_statement)
(#set! indent.matchIndentOf parent.parent.startPosition)
(#set! indent.offsetIndent 0)
(#is? test.config "language-javascript.indentation.alignCaseWithSwitch"))
Expand Down
3 changes: 3 additions & 0 deletions packages/language-typescript/grammars/common/highlights.scm
Original file line number Diff line number Diff line change
Expand Up @@ -841,6 +841,9 @@
"new" @keyword.operator.new._LANG_

"=" @keyword.operator.assignment._LANG_

["&" "|" "<<" ">>" ">>>" "~" "^"] @keyword.operator.bitwise.js
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: af133b0) Was going to ask if we should include bitwise assignment operators somehow, but I see they are already there later in the file...

I guess that's the part we did already. Heh.

Well, good to get these as well!

(For reference, if we want to check if we've got all of these, though I'm not requesting that as a reviewer for the present PR. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators)


(non_null_expression "!" @keyword.operator.non-null._LANG_)
(unary_expression "!" @keyword.operator.unary._LANG_)

Expand Down
20 changes: 18 additions & 2 deletions packages/language-typescript/grammars/common/indents.scm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re 11e2f75)

handle a commonly encountered error state when the user is in the middle of typing a switch statement.

Sounds reasonable to me!

Original file line number Diff line number Diff line change
Expand Up @@ -18,15 +18,31 @@

; By default, `case` and `default` need to be indented one level more than their containing
; `switch`.
([(switch_case "case" @match) (switch_default "default" @match)]
([
(switch_case "case" @match)
(switch_default "default" @match)
; We include this one (and check for a switch_statment ancestor) to handle
; a commonly encountered error state when the user is in the middle of typing
; a switch statement.
(ERROR "case" @match)
]
(#is? test.descendantOfType switch_statement)
(#set! indent.matchIndentOf parent.parent.startPosition)
(#set! indent.offsetIndent 1)
(#is-not? test.config "language-typescript.indentation.alignCaseWithSwitch"))

; When this config setting is enabled, `case` and `default` need to be indented
; to match their containing `switch`.

([(switch_case "case" @match) (switch_default "default" @match)]
([
(switch_case "case" @match)
(switch_default "default" @match)
; We include this one (and check for a switch_statment ancestor) to handle
; a commonly encountered error state when the user is in the middle of typing
; a switch statement.
(ERROR "case" @match)
]
(#is? test.descendantOfType switch_statement)
(#set! indent.matchIndentOf parent.parent.startPosition)
(#set! indent.offsetIndent 0)
(#is? test.config "language-typescript.indentation.alignCaseWithSwitch"))
Expand Down
195 changes: 195 additions & 0 deletions spec/wasm-tree-sitter-language-mode-spec.js
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re 4388bce)

Specs are appreciated! 👍

Original file line number Diff line number Diff line change
Expand Up @@ -561,6 +561,59 @@ describe('WASMTreeSitterLanguageMode', () => {
]);
});

describe('when a highlighting query changes after load', () => {
it('updates the highlighting to reflect the new content', async () => {
jasmine.useRealClock();
const grammar = new WASMTreeSitterGrammar(atom.grammars, jsGrammarPath, jsConfig);

await grammar.setQueryForTest('highlightsQuery', scm`
(identifier) @variable
`);

buffer.setText('abc;');

const languageMode = new WASMTreeSitterLanguageMode({
buffer,
grammar
});
buffer.setLanguageMode(languageMode);
await languageMode.ready;
await wait(0);

expectTokensToEqual(editor, [
[
{ text: 'abc', scopes: ['variable'] },
{ text: ';', scopes: [] }
]
]);

// Set up a promise that resolves when highlighting updates after a
// query change.
let highlightingDidUpdate = new Promise((resolve) => {
let disposable = languageMode.onDidChangeHighlighting(
() => {
disposable.dispose();
resolve();
}
)
});

// Change the highlighting query.
await grammar.setQueryForTest('highlightsQuery', scm`
(identifier) @constant
`);
await highlightingDidUpdate;

// The language mode should automatically reload the query.
expectTokensToEqual(editor, [
[
{ text: 'abc', scopes: ['constant'] },
{ text: ';', scopes: [] }
]
]);
});
});

// TODO: Ignoring these specs because web-tree-sitter doesn't seem to do
// async. We can rehabilitate them if we ever figure it out.
xdescribe('when the buffer changes during a parse', () => {
Expand Down Expand Up @@ -1750,6 +1803,53 @@ describe('WASMTreeSitterLanguageMode', () => {

expect(Array.from(map.values())).toEqual([0, 1, 1, 2, 1, 0]);
});

it('works correctly when straddling an injection boundary, even in the presence of whitespace', async () => {
const jsGrammar = new WASMTreeSitterGrammar(atom.grammars, jsGrammarPath, jsConfig);

jsGrammar.addInjectionPoint(HTML_TEMPLATE_LITERAL_INJECTION_POINT);

const htmlGrammar = new WASMTreeSitterGrammar(
atom.grammars,
htmlGrammarPath,
htmlConfig
);

htmlGrammar.addInjectionPoint(SCRIPT_TAG_INJECTION_POINT);

atom.grammars.addGrammar(jsGrammar);
atom.grammars.addGrammar(htmlGrammar);

// This is just like the test above, except that we're indented a bit.
// Now the edge of the injection isn't at the beginning of the line; it's
// at the beginning of the first _text_ on the line.
buffer.setText(dedent`
<html>
<head>
<script>
let foo;
if (foo) {
debug(true);
}
</script>
</head>
</html>
`);

const languageMode = new WASMTreeSitterLanguageMode({
grammar: htmlGrammar,
buffer,
config: atom.config,
grammars: atom.grammars
});

buffer.setLanguageMode(languageMode);
await languageMode.ready;

let map = languageMode.suggestedIndentForBufferRows(0, 9, editor.getTabLength());

expect(Array.from(map.values())).toEqual([0, 1, 2, 3, 3, 4, 3, 2, 1, 0]);
})
});

describe('folding', () => {
Expand Down Expand Up @@ -2208,6 +2308,101 @@ describe('WASMTreeSitterLanguageMode', () => {
`);
});

it('does not enumerate redundant folds', async () => {
const grammar = new WASMTreeSitterGrammar(atom.grammars, jsGrammarPath, jsConfig);

await grammar.setQueryForTest('foldsQuery', `
(statement_block) @fold
(object) @fold
`);

// This odd way of formatting code produces a scenario where two folds
// would start on the same line. The second of the two folds would never
// be seen when toggling the fold on that line, so we shouldn't treat it
// as a valid fold for any other purpose.
buffer.setText(dedent`
if (foo) {results.push({
bar: 'baz'
})}
`);

const languageMode = new WASMTreeSitterLanguageMode({ grammar, buffer });
buffer.setLanguageMode(languageMode);
await languageMode.ready;

let ranges = languageMode.getFoldableRanges();
expect(ranges.length).toBe(1);
})

it('is not flummoxed by redundant folds when performing foldAllAtIndentLevel', async () => {
const grammar = new WASMTreeSitterGrammar(atom.grammars, jsGrammarPath, jsConfig);

await grammar.setQueryForTest('foldsQuery', `
(statement_block) @fold
(object) @fold
`);

buffer.setText(dedent`
function foo() {
if (true) {
if (foo) {results.push({
bar: 'baz'
})}
}
}

function bar() {
if (false) {
// TODO
}
}
`);

const languageMode = new WASMTreeSitterLanguageMode({ grammar, buffer });
buffer.setLanguageMode(languageMode);
await languageMode.ready;

editor.foldAllAtIndentLevel(1);
expect(getDisplayText(editor)).toBe(dedent`
function foo() {
if (true) {…}
}

function bar() {
if (false) {…}
}
`);

buffer.setText(dedent`
function foo() {
if (true) {
if (foo) {
results.push({
bar: 'baz'
})}
}
}

function bar() {
if (false) {
// TODO
}
}
`);
await languageMode.atTransactionEnd();

editor.foldAllAtIndentLevel(1);
expect(getDisplayText(editor)).toBe(dedent`
function foo() {
if (true) {…}
}

function bar() {
if (false) {…}
}
`);
})

it('can target named vs anonymous nodes as fold boundaries', async () => {
const grammar = new WASMTreeSitterGrammar(atom.grammars, rubyGrammarPath, rubyConfig);

Expand Down
47 changes: 34 additions & 13 deletions src/wasm-tree-sitter-grammar.js
Original file line number Diff line number Diff line change
Expand Up @@ -182,6 +182,7 @@ module.exports = class WASMTreeSitterGrammar {
}).then(() => {
this._queryFilesLoaded = true;
this._loadQueryFilesPromise = null;
this.emitter.emit('did-load-query-files', this);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll pre-emptively explain this so that it's here when someone (historically @DeeDeeG) looks over it.

A feature was requested that would require a new indentation query. I wanted to give the user a workaround that they could drop into their init file, but found it surprisingly hard to pull it off.

The goal of the code in that comment is to take the content that's already in the indentation query, add an extra statement, then set that as the new indentation query for the grammar… and have it apply to all editors without having to close and open a given file again.

But:

  • You can't augment the existing indentation query until the grammar has initialized, or else you won't know what the current query is
  • You can edit the query at that point, and set it as the new query for the grammar, but that doesn't automatically propagate to open files, since their instances of WASMTreeSitterLanguageMode fetched queries from the grammar already and will have to be told to reload them

Hence this fix:

  • The new onDidLoadQueryFiles method tells you exactly when the queries have been loaded from disk
  • The setQueryForTest method (originally meant only for specs, but I'll probably rename it at some point) will now fire an event that WASMTreeSitterLanguageMode automatically subscribes to

This is still a sloppy way to customize one's Tree-sitter queries, and there's further room for improvement in the future; but this improves the status quo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(re: 3b25873)

Asking mostly because it's a bit arcane to me: Is this implementation risking being over-complicated?

If you envision some hypothetical world where a more green-field design were possible, is this how you'd want it to be done?

If not, is the limitation causing it to be done this way handed to us by Tree-sitter, or Atom/Pulsar stuff? Or is this sort of a convenient way to implement? And... any foreseen issues with this approach?

(Not sure why I'm asking all these questions, just trying to pick your brain a bit I guess. Good to know if you've considered it well before we merge it. Seems from the fact you chose to write it up preemptively that you probably have considered it. But anyway, bustin' chops today.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking mostly because it's a bit arcane to me: Is this implementation risking being over-complicated?

I don't think so? Some of the arcana about how query files are loaded could maybe be simplified, but the basic way queries work — one query file for syntax highlighting, one for indentation, one for folding… — is broadly how Neovim does things, and probably other editors.

By comparison: if you wanted to customize how a TextMate grammar did…

  • indentation: it was easy to override the existing value… but hard to augment it, since the existing value was a complex regular expression.
  • folds: largely the same as indentations.
  • syntax highlighting: no way to override without basically writing a whole new grammar.
  • symbols: no way to override whatsoever.

I can't tell which aspect of the system you're asking about, but I can explain further if you have follow-up questions.

});

return this._loadQueryFilesPromise;
Expand Down Expand Up @@ -324,7 +325,7 @@ module.exports = class WASMTreeSitterGrammar {
this.queryCache.delete(queryType);
this[queryType] = contents;
let query = await this.getQuery(queryType);
this.emitter.emit('did-change-query-file', { filePath: '', queryType });
this.emitter.emit('did-change-query', { filePath: '', queryType });
return query;
}

Expand All @@ -350,29 +351,49 @@ module.exports = class WASMTreeSitterGrammar {
this.queryCache.delete(queryType);
return;
}
this.emitter.emit('did-change-query-file', { filePath, queryType });
this.emitter.emit('did-change-query', { filePath, queryType });
});
}));
}
}

// Extended: Calls `callback` when any of this grammar's query files change.
// Extended: Calls `callback` when any of this grammar's queries change.
//
// Since a grammar’s query files won’t change during ordinary operation, this
// method’s main purpose is to aid the development of grammars by applying
// changes to query files in real time. This happens automatically when
// Pulsar is running in dev mode.
// A grammar's queries typically will not change after initial load. When
// they do, it may mean:
//
// The callback is invoked with an object argument with two keys:
// - The user is editing query files in dev mode; Pulsar will automatically
// reload queries in dev mode after changes.
// - A community package is altering a query file via an API like
// {::setQueryForTest}.
//
// * `callback`: The callback to be invoked. Takes one argument:
// * `data`: An object with keys:
// * `filePath`: The path to the query file on disk.
// * `queryType`: The type of query file, as denoted by its
// * `callback` {Function}
// * `data` {Object}
// * `filePath` {String} The path to the query file on disk.
// * `queryType` {String} The type of query file, as denoted by its
// configuration key in the grammar file. Usually one of
// `highlightsQuery`, `indentsQuery`, `foldsQuery`, or `tagsQuery`.
onDidChangeQuery(callback) {
return this.emitter.on('did-change-query', callback);
}

// Extended: Calls `callback` when any of this grammar's queries change.
//
// Alias of {::onDidChangeQuery}.
onDidChangeQueryFile(callback) {
return this.emitter.on('did-change-query-file', callback);
return this.onDidChangeQuery(callback);
}

// Extended: Calls `callback` when this grammar first loads its query files.
//
// Since a grammar may not load immedately on startup, this method makes it
// easier to hook into the query life cycle in order to modify or augment a
// grammar's default queries.
//
// * callback A function with the following argument:
// * grammar The {WASMTreeSitterGrammar} whose queries have loaded.
onDidLoadQueryFiles(callback) {
return this.emitter.on('did-load-query-files', callback);
}

activate() {
Expand Down
Loading
Loading