feat!: enhance multiline (expr) parsing #35
@milisims what's the state of this PR? Should I give it a test, or is it still WIP?
The flattening of (expr) into (sym), (str), and (num) is definitely concrete and staying, and (sym) still has the anonymous nodes for ascii symbols, so those changes will be stable. For the other change, my goal is to facilitate querying ambiguous markup. What I did here was add the "pre", "mid", and "post" signals (empty nodes) as a part of (sym). Basically, if the symbol is immediately before, in between, or after alnum characters, the relevant anonymous node is shown. So a bold query can be simplified to However, right now So, there are three solutions.
Since the signals are optional and very cheap to have, I'm not really interested in dropping them. The second is barely more expensive, but I think it's the simplest to use and understand. 3 gives the most control, but take for example: Currently I'm just writing some play queries to figure out a scheme that makes the most sense, if there is one. Have any thoughts?
From the given options, I think I would prefer option 2, but option 3 would also be helpful. My current implementation works, but it's far from perfect and has a bunch of bugs and edge cases. I'm looking for a way to simplify it, and both of these options should help, but with option 2 I might be able to drop custom parsing completely. Whichever way you choose, even the currently implemented one, should be helpful to some extent.
I started preparing a branch with these changes, and I ran into one issue with tags parsing.
Is parsed like this:
But if it's a single line:
It parses it properly:
Generally, it seems that adding a second line with any type of content immediately breaks parsing the tags.
Previously, (expr) had anonymous "str", "num", and "sym" nodes. Those are now exposed. (sym) nodes retain the anonymous symbols, like (sym "*"). Additionally, (sym next: "str") indicates the symbol is before an immediate (str), and (sym prev: "num") indicates the symbol is after a number.

Add (nl) in multiline text:
- (paragraph)
- (fndef (description))
- (contents), in drawers, blocks, dynamic blocks, and latex_envs

Add "sub" and "final" fields to (stars)
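To make the flattened shape and the prev/next fields concrete, here is a toy model in Python. It is not the grammar from this PR: the tokenizing rules (letters become str, digits become num, newlines become nl, any other non-space character becomes sym) and the `flatten` helper are assumptions made for illustration only.

```python
import re

# Toy tokenizer (assumed rules, not the real grammar):
# letters -> str, digits -> num, newline -> nl, other non-space char -> sym.
TOKEN = re.compile(r"(?P<str>[^\W\d_]+)|(?P<num>\d+)|(?P<nl>\n)|(?P<sym>\S)")


def flatten(text):
    """Return a flat list of node dicts; sym nodes may carry 'prev'/'next'."""
    nodes = [
        {"type": m.lastgroup, "text": m.group(), "start": m.start(), "end": m.end()}
        for m in TOKEN.finditer(text)
    ]
    for i, node in enumerate(nodes):
        if node["type"] != "sym":
            continue
        # 'prev' is set only when a str/num ends exactly where the symbol starts.
        if i > 0:
            before = nodes[i - 1]
            if before["type"] in ("str", "num") and before["end"] == node["start"]:
                node["prev"] = before["type"]
        # 'next' is set only when a str/num starts exactly where the symbol ends.
        if i + 1 < len(nodes):
            after = nodes[i + 1]
            if after["type"] in ("str", "num") and after["start"] == node["end"]:
                node["next"] = after["type"]
    return nodes
```

In this model, the opening `*` of `a *bold* b` gets only a `next` field and the closing `*` gets only a `prev` field, which is the kind of signal a markup query could match on without a predicate.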
Tags issue is fixed, thanks! This content:
Generates this tree:
It treats the link as a checkbox.
This PR will flatten (expr) in paragraph, contents, description & add (nl) nodes.

(expr) nodes, which previously contained a sequence of anonymous "str", "num", and "sym" nodes, are replaced with a corresponding sequence of (str), (num), and (sym) nodes. In cases where there's one (expr) (like block names, properties, directive names, etc.) the (expr) node still exists (but will contain named nodes instead of anonymous nodes). For example, a block starting with #+begin_ab3 is parsed as (expr (str) (num)).

In (paragraph), (item), (fndef (definition)), there's now just a sequence of (str), (num), (sym), and (nl) nodes. Well, no (nl)s in (item).

Note that (sym ":") still works for ascii symbols like expr previously did, so we don't need to check for ascii symbols explicitly in a predicate. This makes querying for those symbols quite fast, since they're part of the AST and don't require a predicate to check.

In (paragraph), (fndef (description)), (contents) which is in drawer, block, dynamic_block, and latex_env, newlines are now given a node: (nl)
This will resolve #31 and #26 by enabling queries for single line items:
Fixed width area:
I tried some combinations of anchors and I couldn't get this down to one pattern to match the first + every other line.
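The first-line versus later-line asymmetry can be seen in a small Python model of the flat node sequence. This is a sketch outside tree-sitter, not the query from this PR: the `(type, text)` pair representation and the `lines_starting_with` helper are hypothetical.

```python
def lines_starting_with(nodes, symbol):
    """Return the 0-based index of every line whose first node is (sym symbol).

    `nodes` is a flat list of (type, text) pairs, with ("nl", "\n") marking
    line breaks, loosely mirroring the flattened sequence described above.
    """
    hits = []
    line = 0
    at_line_start = True  # the first line has no preceding (nl)
    for kind, text in nodes:
        if kind == "nl":
            line += 1
            at_line_start = True  # later lines begin right after an (nl)
            continue
        if at_line_start and kind == "sym" and text == symbol:
            hits.append(line)
        at_line_start = False
    return hits
```

The two ways `at_line_start` becomes true (start of the sequence, or right after an (nl)) correspond to the two cases that were hard to fold into a single anchored pattern.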
For #31, sexp diary entries will require a predicate / some effort if you want to support multiline expressions as emacs' orgmode does, but single line support is straightforward as above. For the multiline version, lua-match? with something like %b() would be helpful, if you're using neovim.