feat: inline DecodeBit function for improved performance #55
Description:
With immense respect for the work done so far, I am writing this pull request to propose an optimization that would improve the performance of the github.com/bodgit/sevenzip project, which relies on the lzma package from github.com/ulikunitz/xz.
Currently, I am working on the decompression of large 7z archives (using LZMA2) in the GB range. As part of this effort, I have noticed that the repeated calls to the DecodeBit function of the rangeDecoder type lead to frequent memory writes and potential performance bottlenecks.
To address this issue, I propose inlining the DecodeBit function, which would reduce the overhead of function calls and make efficient use of registers for intermediate states. By doing so, we can significantly improve the decompression performance of large-scale 7z archives.
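To illustrate the idea, here is a minimal sketch rather than the actual diff. It assumes a simplified, hypothetical range decoder: the names `decodeBit`, `decodeTreeCalls`, `decodeTreeInlined`, `nextByte`, and the constants are illustrative and are not taken from github.com/ulikunitz/xz. The call-based variant updates the decoder fields through the receiver on every bit, while the inlined variant keeps `nrange` and `code` in local variables for the whole loop and writes them back once.

```go
package main

// Hypothetical, simplified range decoder for illustration only.
// It is NOT the implementation from github.com/ulikunitz/xz.

const (
	probBits = 11                  // probabilities are 11-bit values
	probInit = 1 << (probBits - 1) // initial probability (0.5)
	moveBits = 5                   // adaptation speed
	top      = 1 << 24             // normalization threshold
)

type prob uint16

type rangeDecoder struct {
	in     []byte
	pos    int
	nrange uint32
	code   uint32
}

// newRangeDecoder loads the first four input bytes into code.
// (Simplified start-up; a real LZMA stream has additional framing.)
func newRangeDecoder(in []byte) *rangeDecoder {
	d := &rangeDecoder{in: in, nrange: 0xffffffff}
	for i := 0; i < 4; i++ {
		d.code = d.code<<8 | uint32(d.nextByte())
	}
	return d
}

// nextByte returns the next input byte, or 0 once the input is exhausted.
func (d *rangeDecoder) nextByte() byte {
	if d.pos >= len(d.in) {
		return 0
	}
	b := d.in[d.pos]
	d.pos++
	return b
}

// decodeBit decodes one bit, updating the decoder fields on every call.
func (d *rangeDecoder) decodeBit(p *prob) uint32 {
	bound := (d.nrange >> probBits) * uint32(*p)
	var b uint32
	if d.code < bound {
		d.nrange = bound
		*p += (1<<probBits - *p) >> moveBits
	} else {
		d.code -= bound
		d.nrange -= bound
		*p -= *p >> moveBits
		b = 1
	}
	if d.nrange < top { // normalize
		d.nrange <<= 8
		d.code = d.code<<8 | uint32(d.nextByte())
	}
	return b
}

// decodeTreeCalls decodes a bit-tree symbol with one decodeBit call per bit.
func (d *rangeDecoder) decodeTreeCalls(probs []prob, bits int) uint32 {
	m := uint32(1)
	for i := 0; i < bits; i++ {
		m = m<<1 | d.decodeBit(&probs[m])
	}
	return m - uint32(1)<<uint(bits)
}

// decodeTreeInlined performs the same decoding with the bit step written
// out in the loop body; nrange and code live in locals until the end.
func (d *rangeDecoder) decodeTreeInlined(probs []prob, bits int) uint32 {
	nrange, code := d.nrange, d.code
	m := uint32(1)
	for i := 0; i < bits; i++ {
		p := probs[m]
		bound := (nrange >> probBits) * uint32(p)
		if code < bound {
			nrange = bound
			probs[m] = p + (1<<probBits-p)>>moveBits
			m <<= 1
		} else {
			code -= bound
			nrange -= bound
			probs[m] = p - p>>moveBits
			m = m<<1 | 1
		}
		if nrange < top { // normalize
			nrange <<= 8
			code = code<<8 | uint32(d.nextByte())
		}
	}
	d.nrange, d.code = nrange, code // single write-back
	return m - uint32(1)<<uint(bits)
}
```

Both variants should decode identical symbols from the same input and leave the decoder in the same state; only the number of loads and stores to the decoder struct differs. Whether this yields a measurable speed-up depends on the Go compiler version and on what its inliner already does, so it is worth benchmarking on the target workload.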
The key advantages of this optimization include:
During my local testing, I observed a performance improvement of approximately 19% in scenarios involving repeated calls to DecodeBit. This suggests the benefit it can bring to the github.com/bodgit/sevenzip project, particularly when decompressing GB-sized 7z archives.
before:
after:
I kindly request your review and feedback on this pull request. Thank you.