Add syntax table property for comment semantic tokens. #3696

JimDBh · 2022-08-26T17:58:10Z

This is important to let Emacs understand that certain parts of the code (e.g. wrapped in C macros) are not in effect and should be bypassed by commands like forward-sexp.

For example, if we have the following code in emacs with c-mode:

#include <stdio.h>

#define SOME_MACRO 1

int myFun(int par) { // => This brace has no match
  // ...
#if SOME_MACRO
  if (par == 0) { // => This incorrectly matches the brace at the end of the function
    printf("example check 0\n");
#else
  if (par == 1) { // => This should have been ignored by forward-sexp
    printf("example check 1\n");
#endif
    // ...
  }
  return 0;
} // => This brace matches the wrong one

void someFun() {
  // Do something.
  myFun(1);
  return; // => beginning-of-defun won't work here
}

With this patch, these should work fine.
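For readers unfamiliar with the mechanism, the following is a minimal sketch of how a syntax-table text property can make Emacs skip a region as if it were a comment. It is not the code from this patch: my/mark-region-as-comment and my/demo-forward-sexp are hypothetical names used only for illustration, and the buffer content is a toy stand-in for the C example above.

```elisp
;; Hypothetical helper, not part of lsp-mode: mark BEG..END as a comment
;; by putting comment-start syntax on its first character and
;; comment-end syntax on its last character.
(defun my/mark-region-as-comment (beg end)
  "Give the text between BEG and END comment syntax via text properties."
  (with-silent-modifications
    (put-text-property beg (1+ beg) 'syntax-table (string-to-syntax "<"))
    (put-text-property (1- end) end 'syntax-table (string-to-syntax ">"))))

(defun my/demo-forward-sexp ()
  "Return point after `forward-sexp' skips a brace marked as a comment."
  (with-temp-buffer
    ;; Line 2 plays the role of a disabled preprocessor branch.
    (insert "{ a\n{ b\n}")
    ;; Text-property syntax is only consulted when this is non-nil.
    (setq-local parse-sexp-lookup-properties t)
    ;; Sexp scanning only skips comments when this is non-nil
    ;; (c-mode enables it; a plain temp buffer does not).
    (setq-local parse-sexp-ignore-comments t)
    ;; Characters 5..8 are the second line, "{ b\n".
    (my/mark-region-as-comment 5 9)
    (goto-char (point-min))
    (forward-sexp)
    (point)))
```

Without the call to my/mark-region-as-comment, forward-sexp from the first brace signals a scan-error because the braces are unbalanced. With it, the second opening brace is skipped and point lands after the final closing brace.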

yyoncho · 2022-08-26T18:14:09Z

@sebastiansturm @ericdallo - willing to review this one?

yyoncho

Looks good. I am not familiar with this code, so I will let the others review it. Meanwhile, a few elisp nits:

lsp-semantic-tokens.el

ericdallo

Looks good but I'd like an OK from @sebastiansturm as well 😅

sebastiansturm · 2022-08-28T19:31:28Z

sorry for being late, have been away for a week. The feature looks very useful, though I'm not sure how to test it. I've set lsp-semantic-tokens-set-comment-syntax to t, and lsp-semantic-tokens--put-comment-syntax is being called, but comment-search-forward fails to return a match (in a small C++ buffer with an ifdef-disabled part that correctly gets fontified using lsp-face-semhl-comment); searching for the regexp \s< doesn't turn up anything either. Any hints for me on how I might verify that the feature is actually working?
I'll add some preliminary comments to the review section now

sebastiansturm · 2022-08-28T19:38:01Z

lsp-semantic-tokens.el

+    (setq lsp-semantic-tokens-set-comment-syntax t))
+  (font-lock-flush))
+
+(defun lsp-semantic-tokens--get-overlapping-comments (beg end)


Given the way this is used, I think it'd make sense to always return non-nil beg and end, with beg or end left unmodified if it doesn't overlap a comment token. Maybe the function could also be renamed to something like extend-region-to-include-comment-tokens (plus the prefix, of course); in my opinion that would be slightly more intuitive.
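A rough sketch of what such a function might look like is given below. It is only an illustration of the suggested contract (always return a non-nil pair): the name carries a hypothetical my/ prefix, and the COMMENTS argument, a list of non-overlapping (START . END) pairs, is an assumption. The patch itself locates comment tokens through text properties.

```elisp
(require 'cl-lib)

;; Hypothetical sketch: COMMENTS is assumed to be a list of
;; non-overlapping (START . END) pairs with END exclusive.
(defun my/extend-region-to-include-comment-tokens (beg end comments)
  "Return (BEG . END) widened to cover every range in COMMENTS it overlaps.
Both values are always non-nil; a bound that does not fall inside a
comment range is returned unchanged."
  (cl-loop for (c-beg . c-end) in comments
           ;; Half-open ranges [c-beg, c-end) and [beg, end) overlap.
           when (and (< c-beg end) (< beg c-end))
           do (setq beg (min beg c-beg)
                    end (max end c-end)))
  (cons beg end))
```

With this shape, the caller can use the returned car and cdr directly and does not need to check either one for nil.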

sebastiansturm · 2022-08-28T19:38:15Z

lsp-semantic-tokens.el

+    (cons prev-beg next-end)))
+
+(defun lsp-semantic-tokens--remove-comment-syntax-strict (beg end)
+  "Remove all commnet syntax strictly in (BEG END), even if they


minor typo (commnet)

sebastiansturm · 2022-08-28T19:38:54Z

lsp-semantic-tokens.el

+        (let ((beg-match (prop-match-beginning loc))
+              (end-match (prop-match-end loc)))
+          (remove-text-properties beg-match end-match '(lsp-semantic-token--comment-beg))
+          (cl-loop for i from beg-match below end-match do


is it necessary to loop here (i.e., can't beg-match and end-match simply be passed to {put,remove}-text-propert{y,ies})?

sebastiansturm · 2022-08-28T19:41:03Z

lsp-semantic-tokens.el

+                   (remove-text-properties i (1+ i) '(lsp-semantic-token--previous-syntax-table)))))
+      ;; Remove comment ends
+      (goto-char end)
+      (cl-do ((loc (text-property-search-backward


the body form of cl-do here is very similar to the one above, might make sense to extract it (or not, given that it's not very long -- I'm not sure)

sebastiansturm · 2022-08-28T19:43:35Z

lsp-semantic-tokens.el

+      (setq beg new-beg))
+    (when new-end
+      (setq end new-end)))
+  (lsp-semantic-tokens--remove-comment-syntax-strict beg end)


given that lsp--semantic-tokens-fontify removes comment syntax right at the outset, this call to lsp-semantic-tokens--remove-comment-syntax-strict might needlessly degrade performance? Or is this call necessary for some reason I don't see right now?

JimDBh · 2022-08-29T16:34:17Z

Thanks for the review! I'll address the review comments next. As for testing the code, in my local tests comment-search-forward does find the "disabled" lines, and paren/brace matching ignores them. I'm using the snippet provided in this MR to test. Could you let me know what code you used for testing so I can reproduce? Thanks!

sebastiansturm · 2022-08-29T20:28:01Z

this is what I'm using (with clangd, lsp-semantic-tokens-set-comment-syntax set to t):

int main() {
  int x = 0;
#ifdef NOT_DEFINED
  int y = 1;
#endif
  return x;
}

lsp-semantic-tokens--put-comment-syntax gets called with the right begin/end arguments, no luck with comment-search-forward though

JimDBh · 2022-08-29T21:23:09Z

Thanks! However I was able to do comment-search-forward from the start of the buffer, and it ends up at the "i" character in the #ifdef macro (not sure why it wouldn't just be at the "#" sign, need to look into this a bit more)

JimDBh · 2022-08-30T17:47:19Z

Upon further testing with a very large c-mode file with MANY such macros, this patch makes lsp-mode lag a lot. The culprit seems to be text-property-search-forward/backward. I will see if we can keep a buffer-local cache of the added comment start/end pairs, and use that in the functions. In this way it should hopefully be faster, and we also don't have to rely on text-property-search-forward, which is not available for 26.x.
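The cache idea could be sketched roughly as follows. Every name here is hypothetical and nothing is taken from the patch; the sketch stores plain integer positions, which go stale once the buffer is edited, so a real implementation would probably need markers or some other way to invalidate entries.

```elisp
(require 'cl-lib)

;; Hypothetical names; this only illustrates a buffer-local cache.
(defvar-local my/comment-ranges nil
  "List of (START . END) pairs that were given comment syntax.")

(defun my/record-comment-range (beg end)
  "Remember that BEG..END was marked as a comment in this buffer."
  (push (cons beg end) my/comment-ranges))

(defun my/take-comment-ranges-in (beg end)
  "Remove and return the cached ranges that overlap BEG..END."
  (let* ((overlaps (lambda (r) (and (< (car r) end) (< beg (cdr r)))))
         (hit (cl-remove-if-not overlaps my/comment-ranges)))
    ;; Keep only the ranges that lie outside the requested region.
    (setq my/comment-ranges (cl-remove-if overlaps my/comment-ranges))
    hit))
```

The removal step would then walk the returned pairs and clear the syntax-table property on each one, instead of searching the buffer with text-property-search-forward.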

yyoncho · 2022-11-21T18:46:59Z

In this way it should hopefully be faster, and we also don't have to rely on text-property-search-forward, which is not available for 26.x.

I think we can go ahead and drop 26.x at this point. Not sure if this helps solve your issue.

Add syntax table property for comment semantic tokens.

2a9bb3b

This is important to let Emacs understand that certain parts of the code (e.g. wrapped in C macros) are not in effect and should be bypassed by commands like forward-sexp.

github-actions bot added the semantic-tokens label Aug 26, 2022

yyoncho reviewed Aug 26, 2022

JimDBh added 3 commits August 26, 2022 11:18

Fix: use font-lock-flush instead.

918de69

Address a few comments.

ad36e4f

Fix compile errors.

c999198

ericdallo reviewed Aug 26, 2022

Fix compile error (2nd attempt)

4b68c57

sebastiansturm reviewed Aug 28, 2022
