Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for custom character length alphabets #59

Merged
merged 29 commits into from
Dec 2, 2024

Conversation

anatoly-kussul
Copy link
Contributor

@anatoly-kussul anatoly-kussul commented Nov 30, 2024

goos: linux
goarch: amd64
cpu: 12th Gen Intel(R) Core(TM) i5-12400F
                         │    v4.1.0    │                new                  │
                         │    sec/op    │   sec/op     vs base                │
UUID-12                     412.4n ± 0%   326.5n ± 0%  -20.84% (p=0.000 n=10)
Encoding-12                136.00n ± 1%   43.73n ± 0%  -67.85% (p=0.000 n=10)
Decoding-12                 270.3n ± 0%   180.0n ± 1%  -33.42% (p=0.000 n=10)
NewWithAlphabet-12         8064.0n ± 0%   623.8n ± 0%  -92.27% (p=0.000 n=10)
EncodingB57_MB-12                         122.2n ± 5%
EncodingB16-12                            132.5n ± 1%
EncodingB16_MB-12                         358.6n ± 0%
DecodingB16-12                            220.4n ± 1%
DecodingB16_MB-12                         266.8n ± 2%
NewWithAlphabetB16-12                     486.5n ± 0%
NewWithAlphabetB16_MB-12                  837.2n ± 0%

                         │     B/op      │    B/op     vs base                  │
UUID-12                     40.00 ± 0%     40.00 ± 0%        ~ (p=1.000 n=10) ¹
Encoding-12                 24.00 ± 0%     24.00 ± 0%        ~ (p=1.000 n=10) ¹
Decoding-12                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=10) ¹
NewWithAlphabet-12         8545.5 ± 0%     280.0 ± 0%  -96.72% (p=0.000 n=10)
EncodingB57_MB-12                          72.00 ± 0%
EncodingB16-12                             64.00 ± 0%
EncodingB16_MB-12                          512.0 ± 0%
DecodingB16-12                             0.000 ± 0%
DecodingB16_MB-12                          0.000 ± 0%
NewWithAlphabetB16-12                      144.0 ± 0%
NewWithAlphabetB16_MB-12                   592.0 ± 0%

                         │   allocs/op   │ allocs/op   vs base                  │
UUID-12                     2.000 ± 0%     2.000 ± 0%        ~ (p=1.000 n=10) ¹
Encoding-12                 1.000 ± 0%     1.000 ± 0%        ~ (p=1.000 n=10) ¹
Decoding-12                 0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=10) ¹
NewWithAlphabet-12         19.000 ± 0%     3.000 ± 0%  -84.21% (p=0.000 n=10)
EncodingB57_MB-12                          2.000 ± 0%
EncodingB16-12                             2.000 ± 0%
EncodingB16_MB-12                          4.000 ± 0%
DecodingB16-12                             0.000 ± 0%
DecodingB16_MB-12                          0.000 ± 0%
NewWithAlphabetB16-12                      4.000 ± 0%
NewWithAlphabetB16_MB-12                   6.000 ± 0%

@anatoly-kussul
Copy link
Contributor Author

Changed alphabet.Index() to use binary search, which gains ~10% performance on Decode:

                   │     for     │                binary               │
                   │   sec/op    │   sec/op     vs base                │
Decoding-12          203.4n ± 1%   180.2n ± 1%  -11.45% (p=0.000 n=10)

Previous attempt in #51 didn't find any success, because it used sort.Search which adds lambda calls overhead.


                   │     for     │             sort.Search             │
                   │   sec/op    │   sec/op     vs base                │
Decoding-12          203.4n ± 1%   313.1n ± 1%  +53.92% (p=0.000 n=10)

@anatoly-kussul
Copy link
Contributor Author

Added some more benchmarks,
Added optimization for alphabets containing only single-byte characters.

@lithammer
Copy link
Owner

Had to copy some internal Sort's code into project, as before go1.18 sort.Slice uses reflect, which increases allocations and decreases performance

I think we can just raise the minimum version in go.mod to 1.18 instead. 1.13 has been EOL for over 4 years. And 1.18 for almost 2 years.

go 1.13

@anatoly-kussul
Copy link
Contributor Author

seems like I was a bit mistaken, and slices package was an external experimental package at go 1.18, so will have to introduce it as external dependency.
Or if it's ok, we can increase minimum version up to 1.21, slices are already built-in at that point. (1.21 EoL was 3 months ago).

And no clue why CI lint is failing, have to investigate

@anatoly-kussul
Copy link
Contributor Author

Introduced more encode optimizations, using similar technique as used in big.Int.Text().
Updated benchmark in PR description.

@lithammer
Copy link
Owner

Or if it's ok, we can increase minimum version up to 1.21, slices are already built-in at that point. (1.21 EoL was 3 months ago).

Increasing to 1.21 is fine 🙂

@lithammer lithammer self-assigned this Dec 1, 2024
README.md Show resolved Hide resolved
encoder.go Show resolved Hide resolved
encoder.go Outdated
if err != nil {
return
}
if e.alphabet.len == defaultBase { // compiler optimization using constant for default base
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, is it worth it?

Copy link
Contributor Author

@anatoly-kussul anatoly-kussul Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on Decode it's around ~11% performance difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just ran benchmarks on Decode again, and it seems like there was some fluke in the first run.
And it seems like removing this if actually improves performance by ~2%.
Removed it.

@anatoly-kussul
Copy link
Contributor Author

Added benchmarks for NewWithNamespace.
Increase preformance of it by removing the need to lower full string before comparison.

goos: linux
goarch: amd64
cpu: 12th Gen Intel(R) Core(TM) i5-12400F
                         │     old      │                 new                 │
                         │    sec/op    │   sec/op     vs base                │
NewWithNamespace-12        285.4n ±  1%   248.7n ± 2%  -12.89% (p=0.000 n=10)
NewWithNamespaceHttp-12    281.4n ±  1%   253.6n ± 1%   -9.88% (p=0.000 n=10)
NewWithNamespaceHttps-12   322.8n ± 20%   253.6n ± 2%  -21.41% (p=0.000 n=10)
geomean                    296.0n         252.0n       -14.87%

                         │    B/op    │    B/op     vs base                 │
NewWithNamespace-12        200.0 ± 0%   200.0 ± 0%       ~ (p=1.000 n=10)
NewWithNamespaceHttp-12    208.0 ± 0%   208.0 ± 0%       ~ (p=1.000 n=10)
NewWithNamespaceHttps-12   224.0 ± 0%   224.0 ± 0%       ~ (p=1.000 n=10)
geomean                    210.4        210.4       +0.00%

                         │ allocs/op  │ allocs/op   vs base                 │
NewWithNamespace-12        5.000 ± 0%   5.000 ± 0%       ~ (p=1.000 n=10)
NewWithNamespaceHttp-12    5.000 ± 0%   5.000 ± 0%       ~ (p=1.000 n=10)
NewWithNamespaceHttps-12   5.000 ± 0%   5.000 ± 0%       ~ (p=1.000 n=10)
geomean                    5.000        5.000       +0.00%

@lithammer lithammer merged commit 9c03ed0 into lithammer:master Dec 2, 2024
3 checks passed
@lithammer
Copy link
Owner

Thanks! Very impressive work again! 👏

@lithammer
Copy link
Owner

Are you planning on doing more of these or...?

@anatoly-kussul
Copy link
Contributor Author

Thanks! Very impressive work again! 👏

Thanks again for very quick review and approval!

Are you planning on doing more of these or...?

I think I did all I could already. =)

I just noticed that this library (we used v3) was major part of our cpu usage on production.
So I tried to improve performance of it a bit.
We only use New and NewWithNamespace, so that was my main focus in first PR.
But then I noticed, that I also can improve NewWithAlphabet by a lot, and decided to give it a try.

@lithammer
Copy link
Owner

Well I'm glad you took your time. The improvements are quite significant! Always nice to hear from people using it in production 🙂

New release: https://github.com/lithammer/shortuuid/releases/tag/v4.2.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

panic on more than 1 byte symbols Support for custom character length alphabets
2 participants