Commit 7fa4a9a

New point release
Signed-off-by: Mike Lischke <[email protected]>
1 parent 1d3aa47 commit 7fa4a9a

5 files changed, +233 -21 lines changed

ReadMe.md

+12-5
@@ -16,6 +16,7 @@ This package is a fork of the official ANTLR4 JavaScript runtime and has been fu
 - Numerous bug fixes and other changes.
 - Smaller node package (no test specs or other unnecessary files).
 - No differentiation between node and browser environments.
+- InterpreterDataReader implementation.
 - Includes the `antlr4ng-cli` tool to generate parser files compatible with this runtime. This tool uses a custom build of the ANTLR4 tool.

 This package is a blend of the original JS implementation and antlr4ts, which is a TypeScript implementation of the ANTLR4 runtime, but was abandoned. It tries to keep the best of both worlds, while following the Java runtime as close as possible. It's a bit slower than the JS runtime, but faster than antlr4ts.
@@ -108,7 +109,7 @@ const result = visitor.visit(tree);

 ## Benchmarks

-This runtime is monitored for performance regressions. The following table shows the results of the benchmarks run on last release:
+This runtime is monitored for performance regressions. The following tables show the results of the benchmarks previously run on the JS runtime and on the last release of this one. Warm times were taken from 5 runs, with the 2 slowest discarded and the remaining 3 averaged.

 Pure JavaScript release (with type definitions):

@@ -123,10 +124,10 @@ Last release (pure TypeScript):

 | Test | Cold Run | Warm Run|
 | ---- | -------- | ------- |
-| Query Collection| 4823 ms | 372 ms |
-| Example File | 680 ms | 196 ms |
-| Large Inserts | 15176 ms | 15115 ms |
-| Total | 20738 ms | 15704 ms |
+| Query Collection| 4724 ms | 337 ms |
+| Example File | 672 ms | 192 ms |
+| Large Inserts | 15144 ms | 15039 ms |
+| Total | 20600 ms | 15592 ms |

 The numbers are interesting. While the cold run for the query collection is almost 3 seconds faster with pure TS, the overall numbers in warm state are worse. So it's not a pure JS vs. TS situation, but something else must have additional influence and this will be investigated. After all, the TypeScript code is ultimately transpiled to JS, so it's probably a matter of how effectively the TS code is translated to JS.

@@ -144,6 +145,12 @@ The example file is a copy of the largest test file in [this repository](https:/

 ## Release Notes

+### 2.0.7
+
+- Added an InterpreterDataReader implementation (copied from the vscode-antlr4 extension).
+- Benchmark values listed here are now computed from 5 runs, instead of just one.
+
+
 ### 2.0.6

 - Optimizations in HashMap and HashSet (from Peter van Gulik). This can have dramatic speed improvements, depending on the grammar. In the unit tests this shows mostly by a faster cold start.

package.json

+1-1
@@ -1,6 +1,6 @@
 {
     "name": "antlr4ng",
-    "version": "2.0.6",
+    "version": "2.0.7",
     "type": "module",
     "description": "Alternative JavaScript/TypeScript runtime for ANTLR4",
     "main": "dist/index.cjs",

src/misc/InterpreterDataReader.ts

+153
@@ -0,0 +1,153 @@
+/*
+ * Copyright (c) The ANTLR Project. All rights reserved.
+ * Use of this file is governed by the BSD 3-clause license that
+ * can be found in the LICENSE.txt file in the project root.
+ */
+
+import { Vocabulary } from "../Vocabulary.js";
+import { ATN } from "../atn/ATN.js";
+import { ATNDeserializer } from "../atn/ATNDeserializer.js";
+
+/** The data in an interpreter file. */
+export interface IInterpreterData {
+    atn: ATN;
+    vocabulary: Vocabulary;
+    ruleNames: string[];
+
+    /** Only valid for lexer grammars. Lists the defined lexer channels. */
+    channels?: string[];
+
+    /** Only valid for lexer grammars. Lists the defined lexer modes. */
+    modes?: string[];
+}
+
+export class InterpreterDataReader {
+    /**
+     * The structure of the data file is very simple. Everything is line based with empty lines
+     * separating the different parts. For lexers the layout is:
+     * token literal names:
+     * ...
+     *
+     * token symbolic names:
+     * ...
+     *
+     * rule names:
+     * ...
+     *
+     * channel names:
+     * ...
+     *
+     * mode names:
+     * ...
+     *
+     * atn:
+     * a single line with comma separated int values, enclosed in a pair of square brackets.
+     *
+     * Data for a parser does not contain channel and mode names.
+     */
+
+    public static parseInterpreterData(source: string): IInterpreterData {
+        const ruleNames: string[] = [];
+        const channels: string[] = [];
+        const modes: string[] = [];
+
+        const literalNames: Array<string | null> = [];
+        const symbolicNames: Array<string | null> = [];
+        const lines = source.split("\n");
+        let index = 0;
+        let line = lines[index++];
+        if (line !== "token literal names:") {
+            throw new Error("Unexpected data entry");
+        }
+
+        do {
+            line = lines[index++];
+            if (line.length === 0) {
+                break;
+            }
+            literalNames.push(line === "null" ? null : line);
+        } while (true);
+
+        line = lines[index++];
+        if (line !== "token symbolic names:") {
+            throw new Error("Unexpected data entry");
+        }
+
+        do {
+            line = lines[index++];
+            if (line.length === 0) {
+                break;
+            }
+            symbolicNames.push(line === "null" ? null : line);
+        } while (true);
+
+        line = lines[index++];
+        if (line !== "rule names:") {
+            throw new Error("Unexpected data entry");
+        }
+
+        do {
+            line = lines[index++];
+            if (line.length === 0) {
+                break;
+            }
+            ruleNames.push(line);
+        } while (true);
+
+        line = lines[index++];
+        if (line === "channel names:") { // Additional lexer data.
+            do {
+                line = lines[index++];
+                if (line.length === 0) {
+                    break;
+                }
+                channels.push(line);
+            } while (true);
+
+            line = lines[index++];
+            if (line !== "mode names:") {
+                throw new Error("Unexpected data entry");
+            }
+
+            do {
+                line = lines[index++];
+                if (line.length === 0) {
+                    break;
+                }
+                modes.push(line);
+            } while (true);
+        }
+
+        line = lines[index++];
+        if (line !== "atn:") {
+            throw new Error("Unexpected data entry");
+        }
+
+        line = lines[index++];
+        const elements = line.split(",");
+        let value;
+
+        const serializedATN: number[] = [];
+        for (let i = 0; i < elements.length; ++i) {
+            const element = elements[i];
+            if (element.startsWith("[")) {
+                value = Number(element.substring(1).trim());
+            } else if (element.endsWith("]")) {
+                value = Number(element.substring(0, element.length - 1).trim());
+            } else {
+                value = Number(element.trim());
+            }
+            serializedATN[i] = value;
+        }
+
+        const deserializer = new ATNDeserializer();
+
+        return {
+            atn: deserializer.deserialize(serializedATN),
+            vocabulary: new Vocabulary(literalNames, symbolicNames, []),
+            ruleNames,
+            channels: channels.length > 0 ? channels : undefined,
+            modes: modes.length > 0 ? modes : undefined,
+        };
+    }
+}
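
To make the line-based layout described in the doc comment more concrete, here is a minimal standalone sketch that splits interpreter data into its named sections and converts the `atn:` line into numbers. It is not the code from this commit: `splitInterpreterData`, `parseSerializedATN` and the `sample` text are hypothetical names and data introduced only for illustration, and the numbers in the sample are placeholders rather than a valid serialized ATN. Unlike the committed reader, the sketch does not enforce the section order and does not build an `ATN` or `Vocabulary`.

```typescript
// Hypothetical helper (not part of antlr4ng): groups the lines of an interpreter
// data file by section header. Sections are separated by empty lines.
const splitInterpreterData = (source: string): Map<string, string[]> => {
    const sections = new Map<string, string[]>();
    let current: string[] | undefined;

    for (const line of source.split(/\r?\n/)) {
        if (line.length === 0) {
            // An empty line closes the current section.
            current = undefined;
        } else if (current === undefined) {
            // The first line after a separator must be a header like "rule names:".
            if (!line.endsWith(":")) {
                throw new Error(`Unexpected data entry: ${line}`);
            }
            current = [];
            sections.set(line.slice(0, -1), current);
        } else {
            current.push(line);
        }
    }

    return sections;
};

// Hypothetical helper: turns "[4, 1, 2]" into [4, 1, 2].
const parseSerializedATN = (line: string): number[] => {
    const inner = line.trim().replace(/^\[/, "").replace(/\]$/, "").trim();
    if (inner.length === 0) {
        return [];
    }

    return inner.split(",").map((element) => { return Number(element.trim()); });
};

// Made-up parser data (no channel or mode names). The ATN values are placeholders.
const sample = [
    "token literal names:",
    "null",
    "'+'",
    "",
    "token symbolic names:",
    "null",
    "PLUS",
    "",
    "rule names:",
    "expr",
    "",
    "atn:",
    "[4, 1, 2]",
].join("\n");

const sections = splitInterpreterData(sample);
const atnValues = parseSerializedATN(sections.get("atn")?.[0] ?? "[]");
console.log(sections.get("rule names"), atnValues);
```

Stripping both brackets before splitting also covers a one-element list such as `[4]`, where a check for the opening bracket alone would leave the closing bracket attached to the value.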

src/misc/index.ts

+1
@@ -11,3 +11,4 @@ export * from "./HashSet.js";
 export * from "./Interval.js";
 export * from "./IntervalSet.js";
 export * from "./ParseCancellationException.js";
+export * from "./InterpreterDataReader.js";

tests/benchmarks/run-benchmarks.ts

+66-15
@@ -134,7 +134,14 @@ const splitterTest = () => {
     assert(r4.delimiter === "$$");
 };

-const parseFiles = () => {
+/**
+ * Parses a number of files and returns the time it took to parse them.
+ *
+ * @param logResults If true, the number of statements found in each file and the duration is logged.
+ *
+ * @returns The time it took to parse each file.
+ */
+const parseFiles = (logResults: boolean): number[] => {
     const testFiles: ITestFile[] = [
         // Large set of all possible query types in different combinations and versions.
         { name: "./data/statements.txt", initialDelimiter: "$$" },
@@ -147,11 +154,15 @@ const parseFiles = () => {
         { name: "./data/sakila-db/sakila-data.sql", initialDelimiter: ";" },
     ];

-    testFiles.forEach((entry) => {
+    const result: number[] = [];
+    testFiles.forEach((entry, index) => {
         const sql = fs.readFileSync(path.join(path.dirname(__filename), entry.name), { encoding: "utf-8" });

         const ranges = determineStatementRanges(sql, entry.initialDelimiter);
-        console.log(" Found " + ranges.length + " statements in " + entry.name + ".");
+
+        if (logResults) {
+            console.log(` Found ${ranges.length} statements in file ${index + 1} (${entry.name}).`);
+        }

         const timestamp = performance.now();
         ranges.forEach((range, index) => {
@@ -181,19 +192,33 @@ const parseFiles = () => {
             }
         });

-        console.log(" Parsing all statements took: " + (performance.now() - timestamp) + " ms");
+        const duration = performance.now() - timestamp;
+        if (logResults) {
+            console.log(" Parsing all statements took: " + duration + " ms");
+        }
+
+        result.push(duration);
     });
+
+    return result;
 };

-const parserRun = (index: number) => {
+const parserRun = (showOutput: boolean): number[] => {
+    let result: number[] = [];
     const timestamp = performance.now();
     try {
-        parseFiles();
+        result = parseFiles(showOutput);
     } catch (e) {
         console.error(e);
     } finally {
-        console.log(`Parse run ${index} took ${(performance.now() - timestamp)} ms`);
+        if (showOutput) {
+            console.log(`Overall parse run took ${(performance.now() - timestamp)} ms`);
+        }
     }
+
+    result.push(performance.now() - timestamp);
+
+    return result;
 };

 console.log("\n\nStarting MySQL JS/TS benchmarks");
@@ -204,14 +229,40 @@ splitterTest();

 console.log("Splitter tests took " + (performance.now() - timestamp) + " ms");

-console.log("Running antlr4ng parser (cold) ...");
-parserRun(0);
+console.log("Running antlr4ng parser once (cold) ");
+parserRun(true);
+
+process.stdout.write("Running antlr4ng parser 5 times (warm) ");
+
+const times: number[][] = [];
+
+// Run the parser a few times to get a better average.
+for (let i = 0; i < 5; ++i) {
+    times.push(parserRun(false));
+    process.stdout.write(".");
+}
+console.log();
+
+// Transpose the result matrix.
+const transposed: number[][] = [];
+for (let i = 0; i < times[0].length; ++i) {
+    transposed.push([]);
+    for (const row of times) {
+        transposed[i].push(row[i]);
+    }
+}
+
+// Remove the 2 slowest runs in each row and compute the average of the remaining 3.
+// Sort numerically: without a comparator, sort() compares the values as strings.
+const averageTimes: number[] = [];
+for (const row of transposed) {
+    const values = row.sort((a, b) => { return a - b; }).slice(0, 3);
+    averageTimes.push(values.reduce((sum, time) => { return sum + time; }, 0) / values.length);
+}
+
+for (let i = 0; i < averageTimes.length - 1; ++i) {
+    console.log(` File ${i + 1} took ${averageTimes[i]} ms`);
+}

-console.log("Running antlr4ng parser (warm) ...");
-parserRun(1);
-//parserRun(2);
-//parserRun(3);
-//parserRun(4);
-//parserRun(5);
+console.log(`Overall parse run took ${averageTimes[averageTimes.length - 1]} ms`);

 console.log("Done");
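
The warm-run statistic used by the benchmark (5 runs, drop the 2 slowest, average the remaining 3) can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the benchmark script itself: `transpose`, `averageOfFastest` and the timings in `runs` are hypothetical names and made-up numbers. Note that `Array.prototype.sort` without a comparator compares elements as strings, so the sketch passes a numeric comparator and sorts a copy instead of the input array.

```typescript
// Hypothetical helper: turns a list of runs (one row per run, one column per file)
// into a list of columns (one row per file, one entry per run).
const transpose = (matrix: number[][]): number[][] => {
    if (matrix.length === 0) {
        return [];
    }

    return matrix[0].map((_, column) => { return matrix.map((row) => { return row[column]; }); });
};

// Hypothetical helper: averages the `keep` smallest values of `samples`.
const averageOfFastest = (samples: number[], keep: number): number => {
    // Sort a copy numerically. The default sort would order 1000 before 90.
    const fastest = [...samples].sort((a, b) => { return a - b; }).slice(0, keep);

    return fastest.reduce((sum, time) => { return sum + time; }, 0) / fastest.length;
};

// Made-up timings in milliseconds: 5 runs with 2 measurements each.
const runs: number[][] = [
    [300, 9000],
    [120, 10000],
    [90, 11000],
    [1000, 9500],
    [150, 12000],
];

const averages = transpose(runs).map((column) => { return averageOfFastest(column, 3); });
console.log(averages); // [120, 9500]
```

With a string-based sort the first column would keep 1000, 120 and 150 instead of 90, 120 and 150, which is why the comparator matters once the timings differ in their number of digits.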
