This repository was archived by the owner on Feb 17, 2025. It is now read-only.

Commit e538c2b

feat!: Rebuild text parsing. Remove (expr), add (sym)(str)(num)(nl)
Previously, (expr) had anonymous "str" "num" and "sym" nodes. Those are now exposed. (sym) nodes retain the anonymous symbols, like (sym "*"). Additionally, (sym next: "str") indicates the symbol is before an immediate (str), and (sym prev: "num") indicates the symbol is after a number.

Add (nl) in multiline text:
- (paragraph)
- (fndef (description))
- (contents), in drawers, blocks, dynamic blocks, and latex_envs

Add "sub" and "final" fields to (stars)
1 parent 64cfbc2 commit e538c2b
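
For downstream code that used to consume (expr) nodes, the new node kinds can be checked against a tree dump. The following is a minimal Python sketch and is not part of this commit: it scans an excerpt of the updated README parse (the paragraph on row 2) with a regular expression, counts node kinds, and uses node positions to find a (sym) that directly touches a following (str). The helper names `list_nodes` and `syms_touching_next_str` are hypothetical, and the positional check only approximates what the commit message describes as (sym next: "str"); it does not read the grammar's own fields.

```python
import re
from collections import Counter

# Paragraph excerpt copied from the updated README example in this commit.
TREE_DUMP = """
(paragraph [2, 0] - [3, 0]
  (str [2, 0] - [2, 4])
  (sym [2, 5] - [2, 6])
  (str [2, 6] - [2, 12])
  (str [2, 13] - [2, 15])
  (sym [2, 15] - [2, 16])
  (str [2, 17] - [2, 22])
  (nl [2, 22] - [3, 0]))
"""

# Matches "(kind [row, col] - [row, col]" as printed in the dump above.
NODE_RE = re.compile(r"\((\w+) \[(\d+), (\d+)\] - \[(\d+), (\d+)\]")


def list_nodes(dump):
    """Return (kind, (start_row, start_col), (end_row, end_col)) in dump order."""
    nodes = []
    for match in NODE_RE.finditer(dump):
        kind = match.group(1)
        start_row, start_col, end_row, end_col = map(int, match.groups()[1:])
        nodes.append((kind, (start_row, start_col), (end_row, end_col)))
    return nodes


def syms_touching_next_str(nodes):
    """Hypothetical helper: start positions of sym nodes that end exactly
    where the next node, a str, begins (assumption: adjacency by position)."""
    hits = []
    for (kind, start, end), (next_kind, next_start, _) in zip(nodes, nodes[1:]):
        if kind == "sym" and next_kind == "str" and end == next_start:
            hits.append(start)
    return hits


nodes = list_nodes(TREE_DUMP)
counts = Counter(kind for kind, _, _ in nodes)

# After this commit the excerpt should contain no (expr) nodes.
assert "expr" not in counts
print(dict(counts))
print(syms_touching_next_str(nodes))
```

In this excerpt only the symbol starting at [2, 5] ends where a (str) begins; the symbol at [2, 15] is followed by a gap before the next (str) at [2, 17].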

13 files changed, +71319 -80142 lines changed


.gitignore

Lines changed: 2 additions & 0 deletions
@@ -5,3 +5,5 @@ build
 *.log
 /examples/*/
 /target/
+*.so
+*.o

README.md

Lines changed: 34 additions & 41 deletions
@@ -5,22 +5,6 @@ usefully parse org files to be used in any library that uses tree-sitter
 parsers. It is not meant to implement emacs' orgmode parser exactly, which is
 inherently more dynamic than tree-sitter easily allows.
 
-## Overview
-
-This section is meant to be a quick reference, not a thorough description.
-Refer to the tests in `corpus` for examples.
-
-- Top level node: `(document)`
-- Document contains: `(directive)* (body)? (section)*`
-- Section contains: `(headline) (plan)? (property_drawer)? (body)?`
-- headline contains: `((stars), (item)?, (tag_list)?)`
-- body contains: `(element)+`
-- element contains: `(directive)* choose(paragraph, drawer, comment, footnote def, list, block, dynamic block, table)` or a bare `(directive)`
-- paragraph contains: `(expr)+`
-- expr contains: anonymous nodes for 'str', 'num', 'sym', and any ascii symbol that is not letters or numbers. (See top of grammar.js and queries for details)
-
-Like in many regex systems, `*/+` is read as "0/1 or more", and `?` is 0 or 1.
-
 ## Example
 
 ```org
@@ -48,20 +32,24 @@ Parses as:
 (document [0, 0] - [16, 0]
   body: (body [0, 0] - [4, 0]
     directive: (directive [0, 0] - [1, 0]
-      name: (expr [0, 2] - [0, 7])
+      name: (expr [0, 2] - [0, 7]
+        (str [0, 2] - [0, 7]))
       value: (value [0, 9] - [0, 16]
-        (expr [0, 9] - [0, 16])))
+        (str [0, 9] - [0, 16])))
     (paragraph [2, 0] - [3, 0]
-      (expr [2, 0] - [2, 4])
-      (expr [2, 5] - [2, 12])
-      (expr [2, 13] - [2, 16])
-      (expr [2, 17] - [2, 22])))
+      (str [2, 0] - [2, 4])
+      (sym [2, 5] - [2, 6])
+      (str [2, 6] - [2, 12])
+      (str [2, 13] - [2, 15])
+      (sym [2, 15] - [2, 16])
+      (str [2, 17] - [2, 22])
+      (nl [2, 22] - [3, 0])))
   subsection: (section [4, 0] - [16, 0]
     headline: (headline [4, 0] - [5, 0]
       stars: (stars [4, 0] - [4, 1])
       item: (item [4, 2] - [4, 12]
-        (expr [4, 2] - [4, 6])
-        (expr [4, 7] - [4, 12])))
+        (str [4, 2] - [4, 6])
+        (str [4, 7] - [4, 12])))
     plan: (plan [5, 0] - [6, 0]
       (entry [5, 0] - [5, 16]
         timestamp: (timestamp [5, 0] - [5, 16]
@@ -72,50 +60,55 @@ Parses as:
         (listitem [7, 2] - [8, 0]
           bullet: (bullet [7, 2] - [7, 3])
           contents: (paragraph [7, 4] - [8, 0]
-            (expr [7, 4] - [7, 8])
-            (expr [7, 9] - [7, 10])))
+            (str [7, 4] - [7, 8])
+            (str [7, 9] - [7, 10])
+            (nl [7, 10] - [8, 0])))
         (listitem [8, 2] - [11, 0]
           bullet: (bullet [8, 2] - [8, 3])
           checkbox: (checkbox [8, 4] - [8, 7]
-            status: (expr [8, 5] - [8, 6]))
+            status: (sym [8, 5] - [8, 6]))
           contents: (paragraph [8, 8] - [9, 0]
-            (expr [8, 8] - [8, 12])
-            (expr [8, 13] - [8, 14]))
+            (str [8, 8] - [8, 12])
+            (str [8, 13] - [8, 14])
+            (nl [8, 14] - [9, 0]))
           contents: (list [9, 0] - [11, 0]
             (listitem [9, 4] - [10, 0]
               bullet: (bullet [9, 4] - [9, 5])
               checkbox: (checkbox [9, 6] - [9, 9])
               contents: (paragraph [9, 10] - [10, 0]
-                (expr [9, 10] - [9, 14])
-                (expr [9, 15] - [9, 16])))
+                (str [9, 10] - [9, 14])
+                (str [9, 15] - [9, 16])
+                (nl [9, 16] - [10, 0])))
             (listitem [10, 4] - [11, 0]
               bullet: (bullet [10, 4] - [10, 5])
               checkbox: (checkbox [10, 6] - [10, 9]
-                status: (expr [10, 7] - [10, 8]))
+                status: (str [10, 7] - [10, 8]))
               contents: (paragraph [10, 10] - [11, 0]
-                (expr [10, 10] - [10, 14])
-                (expr [10, 15] - [10, 16])))))
+                (str [10, 10] - [10, 14])
+                (str [10, 15] - [10, 16])
+                (nl [10, 16] - [11, 0])))))
         (listitem [11, 2] - [12, 0]
           bullet: (bullet [11, 2] - [11, 3])
           contents: (paragraph [11, 4] - [12, 0]
-            (expr [11, 4] - [11, 8])
-            (expr [11, 9] - [11, 10])))))
+            (str [11, 4] - [11, 8])
+            (str [11, 9] - [11, 10])
+            (nl [11, 10] - [12, 0])))))
     subsection: (section [13, 0] - [16, 0]
       headline: (headline [13, 0] - [14, 0]
         stars: (stars [13, 0] - [13, 2])
         item: (item [13, 3] - [13, 13]
-          (expr [13, 3] - [13, 13]))
+          (str [13, 3] - [13, 13]))
         tags: (tag_list [13, 14] - [13, 19]
-          tag: (tag [13, 15] - [13, 18])))
+          tag: (tag [13, 15] - [13, 18]
+            (str [13, 15] - [13, 18]))))
       body: (body [14, 0] - [16, 0]
         (paragraph [15, 0] - [16, 0]
-          (expr [15, 0] - [15, 4]))))))
+          (str [15, 0] - [15, 4])
+          (nl [15, 4] - [16, 0]))))))
 ```
 
 ## Install
 
-For manual install, use `make`.
-
 For neovim, using `nvim-treesitter/nvim-treesitter`, add to your configuration:
 
 ```lua
