Treesitter: ts_match_limit of 256 is not always enough #26325

Open
Danielkonge opened this issue Nov 30, 2023 · 3 comments · May be fixed by #29133
Labels
bug (issues reporting wrong behavior), treesitter

Comments

@Danielkonge
Contributor

Danielkonge commented Nov 30, 2023

Problem

I have run into a problem with highlighting in InspectTree caused by ts_match_limit being "only" 256.

The bug shows up when I highlight the tree based on the nesting level and the branch contains a large subtree.

Screenshot 2023-11-30 at 14 50 34

I tried to build with no change to the code other than

- ts_query_cursor_set_match_limit(cursor, 256);
+ ts_query_cursor_set_match_limit(cursor, 1024);

in neovim/src/nvim/lua/treesitter.c and that fixes the issue:

Screenshot 2023-11-30 at 14 53 42

Note: This mainly affects languages whose grammar builds the whole tree inside a single top-level node. For example, the screenshots are from the README.md of neovim, where the whole tree sits inside one top-level section.

(Thank you @lucario387 for helping me figure out what causes the bug.)
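
To illustrate why a large subtree can lose matches under a fixed limit, here is a toy model in C. It is not tree-sitter's real algorithm: the eviction policy (drop the oldest in-progress match when the pool is full) and the names `toy_scan_run` and `toy_whole_run_matched` are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: NOT tree-sitter's real matching algorithm.
 * Assumptions: every node in a run opens one in-progress match, no match
 * can finish before the run ends, and when the pool is full the oldest
 * in-progress match is evicted to make room for the new one. */
typedef struct {
  size_t oldest_surviving_start; /* start index of the earliest match still alive */
  bool exceeded_limit;           /* true if anything had to be evicted */
} ToyResult;

ToyResult toy_scan_run(size_t run_length, size_t match_limit) {
  assert(match_limit > 0);
  ToyResult r = { 0, false };
  size_t in_progress = 0;
  for (size_t i = 0; i < run_length; i++) {
    if (in_progress == match_limit) {
      /* Pool is full: drop the oldest match so the new one fits. */
      r.oldest_surviving_start++;
      r.exceeded_limit = true;
    } else {
      in_progress++;
    }
  }
  return r;
}

/* The match spanning the whole run is the one that started at index 0. */
bool toy_whole_run_matched(size_t run_length, size_t match_limit) {
  return toy_scan_run(run_length, match_limit).oldest_surviving_start == 0;
}
```

Under these assumptions, a hypothetical run of 300 nodes loses its whole-run match at a limit of 256 but keeps it at 1024, which resembles the behaviour in the screenshots.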

Steps to reproduce

I think this is already a known issue, see:

I can include exact reproduction steps if you want, but I have already narrowed down the line of code causing this above, so I don't think that is necessary.

Expected behavior

I can see in the discussion in #22055 that ts_match_limit is set relatively low for performance reasons, so I am not sure if simply raising the limit again is a good solution.

In #22055 it was mentioned that it might be better to set a timeout for the query itself and raise the limit, so maybe that is the best solution?
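
A rough sketch of what "raise the limit, but bound the query by time" could look like, in C. This is not Neovim or tree-sitter code: `drain_with_budget`, the callback types, `CountingSource` and `FakeClock` are all hypothetical names for this example, and the clock is injected so the behaviour can be checked without real timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks, not part of tree-sitter or Neovim. */
typedef bool (*NextMatchFn)(void *state); /* false when no matches remain */
typedef uint64_t (*NowFn)(void *clock);   /* monotonic time in microseconds */

typedef struct {
  size_t matches_seen;
  bool timed_out;
} BudgetResult;

/* Pull matches until the source is exhausted or the time budget is spent. */
BudgetResult drain_with_budget(NextMatchFn next_match, void *state,
                               NowFn now, void *clock, uint64_t budget_us) {
  assert(next_match != NULL && now != NULL);
  BudgetResult r = { 0, false };
  const uint64_t start = now(clock);
  for (;;) {
    if (now(clock) - start >= budget_us) {
      r.timed_out = true; /* caller can fall back or retry later */
      break;
    }
    if (!next_match(state)) break;
    r.matches_seen++;
  }
  return r;
}

/* Demo stand-ins: a source with a fixed number of matches and a fake
 * clock that advances by step_us on every reading. */
typedef struct { size_t remaining; } CountingSource;

bool counting_next(void *state) {
  CountingSource *s = state;
  if (s->remaining == 0) return false;
  s->remaining--;
  return true;
}

typedef struct { uint64_t now_us; uint64_t step_us; } FakeClock;

uint64_t fake_now(void *clock) {
  FakeClock *c = clock;
  uint64_t t = c->now_us;
  c->now_us += c->step_us;
  return t;
}
```

The point of the design is that the match limit no longer has to double as the performance guard: the limit can be generous, and the time budget decides when to give up.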

Either way I am mainly creating this issue to have an open issue about finding a solution to this, since #22055 is closed.

Neovim version (nvim -v)

NVIM v0.10.0-dev-1721+g01b91deec-Homebrew

Vim (not Nvim) behaves the same?

N/A

Operating system/version

macOS 14.1.1

Terminal name/version

wezterm 20231030-074559-75909682

$TERM environment variable

wezterm

Installation

tested repo and homebrew

@wookayin
Member

wookayin commented Dec 16, 2023

Cross-posted with #26563

I once had a similar need, though it was not pressing because I settled on a different approach as a workaround. The problem is that a very long span of text with many TS nodes may not be fully captured, even by a correct query.

An example: I was trying to fold "luadocs", i.e. maximal groups of consecutive comment lines, to work around the current issue where (node)+ quantifiers do not work properly (#17060).

Custom query file (lua/folds.scm)
;; extends

; Fold consecutive luadoc comments
 (
  (_) @_non_comment (#not-has-type? @_non_comment comment)
  . (comment) @_start
  . (comment)*
  . (comment) @_end
  . [(function_declaration) (assignment_statement) (variable_declaration)]
  (#make-range! "fold" @_start @_end)
 )

On an example file, $VIMRUNTIME/runtime/lua/vim/lsp.lua:

(1) with ts_match_limit 256:

[image]
  • Lines 725-805 (~81 lines) are not captured as @fold because the block is too long, probably because of the treesitter injection of luadoc. See the signcolumn for the evaluated foldexpr. All other luadoc comments, which are relatively short, are captured correctly.

(2) with ts_match_limit 1024:

[image]
  • Now lines 725-805 can be folded.

I know the custom query above is a crazy hack and not efficient; this would be better addressed by improving the parser so that it produces an appropriate hierarchy that groups consecutive comment lines. So I ended up with an alternative foldexpr implementation.
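
For readers wondering what a query-free alternative might look like, here is a minimal sketch in C of the same grouping done as a single linear pass over per-line classifications. It is not the foldexpr implementation mentioned above; `LineKind`, `FoldRange` and `comment_folds` are hypothetical names, and the line classification is assumed to come from elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-line classification, assumed to be computed elsewhere. */
typedef enum { LINE_OTHER, LINE_COMMENT, LINE_DECL } LineKind;

typedef struct { size_t start; size_t end; } FoldRange; /* inclusive, 0-based */

/* Record a fold for every maximal run of at least two comment lines that is
 * immediately followed by a declaration line. Writes at most `cap` ranges
 * to `out` and returns how many were written.
 * Unlike the query, a run at the very start of the input is also folded. */
size_t comment_folds(const LineKind *lines, size_t n,
                     FoldRange *out, size_t cap) {
  assert(lines != NULL || n == 0);
  size_t count = 0;
  size_t i = 0;
  while (i < n) {
    if (lines[i] != LINE_COMMENT) {
      i++;
      continue;
    }
    size_t start = i;
    while (i < n && lines[i] == LINE_COMMENT) i++;
    /* i is now the first line after the comment run. */
    if (i - start >= 2 && i < n && lines[i] == LINE_DECL) {
      if (count == cap) break; /* output buffer is full */
      out[count].start = start;
      out[count].end = i - 1;
      count++;
    }
  }
  return count;
}
```

Because each line is visited once, the cost grows linearly with the buffer and does not depend on any match limit.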

The downside of raising the limit is performance: with a ts_match_limit of 1024, parsing becomes noticeably slow, and neovim freezes for a few seconds while parsing the buffer (~2500 lines).

@boydkelly

Is this tree-sitter-grammars/tree-sitter-yaml#9 perhaps related?

@wookayin
Member

wookayin commented May 28, 2024

Probably not, because the bug you show looks more like the parser failing to parse a very long document correctly. This issue is about queries (which should work on a "valid" syntax tree after successful parsing), so I don't think it's related.

@Danielkonge Danielkonge linked a pull request Jun 1, 2024 that will close this issue