Replies: 4 comments 1 reply
-
Interesting observation, Peter. The current runtime has no optimizations applied so far, as it first had to work like the JS runtime, and I want to have the runtime tests in TS before I start improving it. For the story: a better replacement might be the immutable.js library (which I used in jree), as their containers use object equality too (they use the term value equality).
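For illustration, a minimal sketch of what that value-based equality looks like with immutable.js (assuming the standard `immutable` npm package; the field names here are made up):

```ts
import { Map as ImmutableMap, Set as ImmutableSet, is } from "immutable";

// Two structurally identical maps are equal as values, not just as references.
const a = ImmutableMap({ state: 7, alt: 1 });
const b = ImmutableMap({ state: 7, alt: 1 });
console.log(is(a, b)); // true

// A Set therefore deduplicates by value, which is the behavior a runtime
// container for configuration-like objects needs.
const set = ImmutableSet().add(a).add(b);
console.log(set.size); // 1: b is recognized as a duplicate of a
```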
-
I just did a quick check with your proposed changes in the benchmarks and found almost no improvement.
So for these parse runs the HashSet change does not have much impact. Do you have an explanation or an idea why it is so heavy in your test cases? OK, the cold start is faster, but that's not relevant here, as it is a one-time task. More important is what happens once the runtime is warm.
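As a side note, a minimal sketch of how warm performance could be separated from the cold start when comparing such changes; the `parse` callback is a placeholder for running the generated lexer/parser over one file, not part of any runtime API:

```ts
import { performance } from "node:perf_hooks";

// Runs a few untimed warm-up rounds first, so the timed rounds measure the
// warmed-up runtime (populated DFA cache, JIT-optimized hot paths) rather
// than the cold start.
function benchmarkWarm(files: string[], parse: (file: string) => void, warmupRounds = 2): number {
    for (let round = 0; round < warmupRounds; round++) {
        for (const file of files) {
            parse(file);
        }
    }

    const start = performance.now();
    for (const file of files) {
        parse(file);
    }
    return performance.now() - start;
}
```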
-
Interesting, I used a Java-like grammar. I'll have to validate my observations; it must be a combination of the grammar and input files that triggers the excessive calls to the `length` getter. I actually just now noticed I've used the original Antlr4 runtime in my tests. In your fork you have removed the … For the next few days I don't have my laptop, so I'll have to revalidate my observations next year to get some more insights.
-
I forked the repository. As you can see, this is the `length` getter in question:

```js
get length() {
    return Object.keys(this.data).map((key) => { // 600:39
        return this.data[key].length;
    }, this).reduce((accumulator, item) => { // 602:21
        return accumulator + item;
    }, 0);
}
```

The culprit seems to be the code below, which uses the length of the `HashSet` as the `stateNumber`:

```js
const newState = proposed;
newState.stateNumber = dfa.states.length;
configs.setReadonly(true);
```

Note that I had to run it as a CommonJS module, because my project is CommonJS and cannot switch at the moment, since a number of dependencies would break.
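To make the cost explicit, here is a self-contained toy model (not the runtime's actual classes) of a set whose size is recomputed on every access, combined with an `addDFAState`-style loop that reads that size once per insertion; the overall work grows quadratically with the number of states:

```ts
// Toy stand-in for a hash set whose `length` walks every bucket on each access.
class SlowSet<T> {
    private readonly buckets = new Map<string, T[]>();

    public add(key: string, value: T): void {
        const bucket = this.buckets.get(key) ?? [];
        bucket.push(value);
        this.buckets.set(key, bucket);
    }

    // O(n): recomputed from scratch on every call, like the getter shown above.
    public get length(): number {
        let total = 0;
        for (const bucket of this.buckets.values()) {
            total += bucket.length;
        }
        return total;
    }
}

// One `length` read per inserted state turns the whole loop into O(n^2) work.
const states = new SlowSet<number>();
for (let i = 0; i < 10_000; i++) {
    states.add(String(i), i);
    const stateNumber = states.length; // linear scan on every iteration
    void stateNumber;
}
```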
-
I've done some performance analysis on the JS implementation of `antlr4` using Node.js, as I wasn't very happy with the performance. Parsing and lexing 2000 files took about 130-150 seconds, and some individual source files were taking up to 15-20 seconds per file.

I couldn't really figure out why the parsing was so much slower than I expected, and tried to tune the grammar file among some other things. Finally I hooked up a basic Node.js profile using `--prof`, and to my surprise the `get length()` property of the `HashSet` class was on top of the list.

Looking at the implementation of the `HashSet` class, and especially the `length` property, the code is very inefficient. Replacing the `length` property with a simple field that is incremented when a new (unique) item is added to the set made a huge difference: parsing and lexing time went down to just 25 seconds, which is almost an 80% improvement.

The performance improvement is beyond my expectations, and my grammar and the state of the source files probably play a role, but nonetheless I think it is worthwhile to have a look at the `HashSet` and `HashMap` implementations and, if possible, replace them with their native JS counterparts `Set` and `Map`.
Either way, just wanted to share my observation :)
Snippet:
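The original snippet is not reproduced here; as a rough sketch of the kind of change described, a count maintained at insertion time instead of a recomputed getter could look like this (toy code, not the actual patch to the runtime):

```ts
// Toy hash set that keeps its size in a counter updated on successful inserts,
// so reading `length` is O(1) instead of walking all buckets.
class CountingSet<T> {
    private readonly buckets = new Map<string, T[]>();
    private count = 0;

    public add(key: string, value: T, equals: (a: T, b: T) => boolean): void {
        const bucket = this.buckets.get(key) ?? [];
        if (!bucket.some((existing) => equals(existing, value))) {
            bucket.push(value);
            this.buckets.set(key, bucket);
            this.count++; // only unique items bump the counter
        }
    }

    public get length(): number {
        return this.count; // O(1)
    }
}
```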