@ppkarwasz
Contributor

This PR:

  • Fixes a bug in the benchmark initialization. The decodedData array was not the decoded version of encodedData.
  • Tries to figure out the best baseline. I suspect that the simplest version:
    blackhole.consume(new String(decodedData[i].getBytes(UTF_8), UTF_8));
    could be super-optimized by the compiler to something like:
    blackhole.consume(decodedData[i]);
  • Adds a benchmark for the toLowerCase() method.
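The baseline concern in the second bullet can be illustrated with a minimal sketch. For a well-formed string, the UTF-8 round trip yields a value equal to the input, which is what would make it a candidate for elimination in principle. Whether the JIT actually performs that reduction is not established here; the class name `RoundTripDemo` and the sample string are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Encodes to UTF-8 bytes and decodes back, as in the simplest baseline.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not taken from the benchmark data.
        String original = "pkg:maven/org.example/caf\u00e9@1.0";
        String copy = roundTrip(original);
        System.out.println(copy.equals(original)); // true: same characters
        System.out.println(copy == original);      // false: a distinct String object
    }
}
```

The work the baseline is meant to measure is the encode/decode pair, so the benchmark only makes sense if that pair really runs on every iteration.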

The current benchmark results are:

Benchmark                               (nonAsciiProb)  Mode  Cnt    Score    Error  Units
PercentEncodingBenchmark.baseline                    0  avgt    5   37.273 ±  0.128  us/op
PercentEncodingBenchmark.baseline                  0.1  avgt    5   36.887 ±  0.185  us/op
PercentEncodingBenchmark.baseline                  0.5  avgt    5   36.418 ±  0.548  us/op
PercentEncodingBenchmark.percentDecode               0  avgt    5  158.208 ±  3.255  us/op
PercentEncodingBenchmark.percentDecode             0.1  avgt    5  155.733 ± 29.010  us/op
PercentEncodingBenchmark.percentDecode             0.5  avgt    5  150.590 ± 27.857  us/op
PercentEncodingBenchmark.percentEncode               0  avgt    5  886.282 ± 57.666  us/op
PercentEncodingBenchmark.percentEncode             0.1  avgt    5  879.393 ± 49.545  us/op
PercentEncodingBenchmark.percentEncode             0.5  avgt    5  885.652 ± 57.928  us/op
PercentEncodingBenchmark.toLowerCase                 0  avgt    5  104.868 ±  0.722  us/op
PercentEncodingBenchmark.toLowerCase               0.1  avgt    5  107.073 ±  6.652  us/op
PercentEncodingBenchmark.toLowerCase               0.5  avgt    5  104.674 ±  5.371  us/op
PercentEncodingBenchmark.toLowerCaseJre              0  avgt    5  735.700 ± 25.917  us/op
PercentEncodingBenchmark.toLowerCaseJre            0.1  avgt    5  719.781 ± 36.632  us/op
PercentEncodingBenchmark.toLowerCaseJre            0.5  avgt    5  721.976 ± 31.459  us/op

Fixes a bug in the benchmark initialization and adds a `toLowerCase` benchmark.
Collaborator

@jeremylong jeremylong left a comment

LGTM

@jeremylong jeremylong merged commit 062af5f into package-url:master Mar 23, 2025
3 checks passed
@ppkarwasz ppkarwasz deleted the feat/improve-benchmark branch March 23, 2025 10:37
@ppkarwasz
Contributor Author

The results for different values of nonAsciiProb look alike because there is another bug in the benchmark.

The decodedData and encodedData arrays need to be initialized in a @Setup method, otherwise they'll use the uninitialized value of nonAsciiProb. 😭 It's incredible how many bugs I could fit in those 4 lines.
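A minimal plain-Java sketch of why this happens, assuming the harness sets the parameter field only after the state object has been constructed. It does not use JMH itself: the comments mark where `@Param` and `@Setup` would go, and `generate`, `FieldInitState`, and `SetupState` are hypothetical stand-ins for the benchmark code.

```java
import java.util.Arrays;

public class SetupOrderDemo {
    // Hypothetical stand-in for the benchmark's data generator.
    static String[] generate(double nonAsciiProb) {
        String[] data = new String[4];
        Arrays.fill(data, nonAsciiProb > 0 ? "\u00e9" : "e");
        return data;
    }

    // Buggy shape: the field initializer runs during construction,
    // when nonAsciiProb still holds its default value 0.0.
    static class FieldInitState {
        double nonAsciiProb; // would carry @Param in JMH
        final String[] decodedData = generate(nonAsciiProb);
    }

    // Fixed shape: the data is built in a method that runs after injection.
    static class SetupState {
        double nonAsciiProb; // would carry @Param in JMH
        String[] decodedData;

        void setup() { // would carry @Setup in JMH
            decodedData = generate(nonAsciiProb);
        }
    }

    public static void main(String[] args) {
        FieldInitState buggy = new FieldInitState();
        buggy.nonAsciiProb = 0.5; // simulated parameter injection, too late for the initializer

        SetupState fixed = new SetupState();
        fixed.nonAsciiProb = 0.5; // simulated parameter injection
        fixed.setup();            // data now reflects the injected value

        System.out.println(buggy.decodedData[0].equals("e"));      // true: built with 0.0
        System.out.println(fixed.decodedData[0].equals("\u00e9")); // true: built with 0.5
    }
}
```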

ppkarwasz added a commit to ppkarwasz/packageurl-java that referenced this pull request Mar 23, 2025
Fixes a bug in the benchmark initialization and adds a `toLowerCase` benchmark.
jeremylong pushed a commit that referenced this pull request Mar 23, 2025
* feat: Improve benchmark (#222)

Fixes a bug in the benchmark initialization and adds a `toLowerCase` benchmark.

* fix: Benchmark initialization

The benchmark **must** be initialized in a `@Setup` method, otherwise `nonAsciiProb` will always be `0.0`.

* fix: Improve encoding/decoding performance for ASCII strings

Since strings that don't require **any** percent encoding are in practice the rule, the encoding/decoding code should be optimized for this case.
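
One way such a fast path might look, as a sketch only and not the code from the referenced commit: scan the input once and return the same instance when nothing needs encoding. The set of characters left unencoded here (ASCII letters, digits, `-`, `.`, `_`, `~`) is an assumption made for illustration, and `FastPathDemo` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

public class FastPathDemo {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Assumption for this sketch: only these ASCII characters stay unencoded.
    static boolean isUnreserved(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    static String percentEncode(String input) {
        // Fast path: if every character is unreserved, return the input itself
        // without allocating a byte array or a builder.
        int i = 0;
        while (i < input.length() && isUnreserved(input.charAt(i))) {
            i++;
        }
        if (i == input.length()) {
            return input;
        }

        // Slow path: encode the UTF-8 bytes one by one.
        byte[] bytes = input.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(bytes.length + 16);
        for (byte b : bytes) {
            int c = b & 0xFF;
            if (isUnreserved(c)) {
                sb.append((char) c);
            } else {
                sb.append('%').append(HEX[c >> 4]).append(HEX[c & 0xF]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncode("abc-1.0"));  // abc-1.0
        System.out.println(percentEncode("a b"));      // a%20b
        System.out.println(percentEncode("caf\u00e9")); // caf%C3%A9
    }
}
```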