@onethumb onethumb commented Dec 23, 2025

The crc-fast crate is a drop-in replacement that delivers considerably higher performance (>100GiB/s on latest x86_64 Intel, >50GiB/s on latest x86_64 AMD and aarch64 AWS Graviton, including similar gains for much older x86, x86_64, and aarch64 processors) than crc32fast.

It has a fast, safe, table-based software fallback for non-accelerated systems (PowerPC, RISC-V, etc).

Bumps rust-version to 1.89.0 since that's when the AVX512 intrinsics stabilized, which is still within this project's MSRV (stable is currently 1.92.0). AVX512 provided a pretty dramatic boost for modern x86_64 CPUs.

Tasks

  • benchmarking
    • be sure to try 'many small objects' to test reset/init performance. Let's also compare state/size on the stack.
  • feature toggle
  • fix CI
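
The 'many small objects' task above could be approached with a harness along these lines. This is only a sketch: the `crc32` function is a plain bitwise stand-in rather than either crate under discussion, `time_chunks` is a hypothetical helper, and a real comparison would swap in each backend and use a proper benchmarking framework.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Stand-in checksum: bitwise CRC-32 (ISO-HDLC, the polynomial gzip uses).
/// Assumption: a real benchmark replaces this with the crate being measured.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // The mask is all ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Checksums `input` as independent pieces of `chunk` bytes (must be nonzero),
/// so any per-call init/reset cost is paid once per piece.
/// Returns the elapsed time and the XOR of all piece checksums.
fn time_chunks(input: &[u8], chunk: usize) -> (Duration, u32) {
    let start = Instant::now();
    let mut acc = 0u32;
    for piece in input.chunks(chunk) {
        acc ^= crc32(black_box(piece));
    }
    (start.elapsed(), black_box(acc))
}

fn main() {
    // 1 MiB of deterministic filler data.
    let input: Vec<u8> = (0..1 << 20).map(|i| (i % 251) as u8).collect();
    for chunk in [16, 256, 4096, input.len()] {
        let (elapsed, _) = time_chunks(&input, chunk);
        println!("chunk = {:>8} bytes: {:?}", chunk, elapsed);
    }
}
```

If a backend has noticeable setup cost, it should show up as a widening gap between the 16-byte row and the whole-buffer row. For the stack-size comparison, `std::mem::size_of` on each crate's hasher type would be the natural starting point.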

Tasks for Byron

  • review crc-fast to see if it could be a replacement for crc32fast, particularly whether the two differ in their use of unsafe.

The crc-fast crate is a drop-in replacement that delivers considerably
higher performance (>100GiB/s on latest x86_64 Intel, >50GiB/s on latest
x86_64 AMD and aarch64 AWS Graviton, including similar gains for much
older x86, x86_64, and aarch64 processors).

It has a fast, safe, table-based software fallback for non-accelerated
systems.

1.89.0 is still within this project’s MSRV (stable is currently 1.92.0).
It’s the release that stabilized the AVX512 intrinsics, which help
supply many (but not all) of these impressive performance improvements
on x86_64 CPUs.
@jongiddy
Contributor

This adds a significant number of dependencies for the flate2 crate. Along with the MSRV increase, I think that this should only be introduced behind a feature flag. It would also be good to see some evidence that this change makes a significant difference to inflate/deflate speed.

Member

@Byron Byron left a comment

Thanks a lot for conducting this experiment, it's much appreciated!

However, due to the MSRV increase which can't be feature toggled, I don't think it's anything we could do without a major version bump and without massive evidence that this would make a huge difference in the ecosystem.

But let's work on that.
I am putting it into Request changes so it can't be accidentally merged while we figure out the details.

And I agree with @jongiddy that this would have to go behind a feature toggle to control the added dependencies.
Since we can't really release with it as long as the rust-version is specified, I don't really know where this PR can go unless we wait until we naturally reach that MSRV, or until we have an incredibly good reason to raise the MSRV in a breaking release just to allow the crc-fast crate behind a feature toggle.

@jongiddy
Contributor

@Byron The rust-version only specifies the minimum version for the crate with default features. Some of the existing features, including the recommended zlib-rs feature, require a higher version of Rust. So this could be merged behind a feature flag while keeping the current rust-version.
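
A feature-gated backend along those lines might look like the sketch below. The feature name `crc-fast` and the `backend` module are hypothetical, and the default branch uses a plain bitwise CRC-32 as a stand-in so the example compiles without any dependency. The call in the opt-in branch assumes crc-fast exposes a `checksum(algorithm, bytes)` function returning `u64`, which should be checked against that crate's documentation. The manifest would also need an optional dependency and a matching `[features]` entry.

```rust
/// Default backend. A bitwise CRC-32 stands in for the existing checksum
/// dependency so that this sketch builds on its own.
#[cfg(not(feature = "crc-fast"))]
mod backend {
    pub fn crc32(data: &[u8]) -> u32 {
        let mut crc = !0u32;
        for &byte in data {
            crc ^= u32::from(byte);
            for _ in 0..8 {
                let mask = (crc & 1).wrapping_neg();
                crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
            }
        }
        !crc
    }
}

/// Opt-in backend, compiled only when the hypothetical `crc-fast` feature
/// is enabled, so the higher Rust version is required only by that feature.
#[cfg(feature = "crc-fast")]
mod backend {
    pub fn crc32(data: &[u8]) -> u32 {
        // Assumption: this is the crc-fast one-shot API; verify before use.
        crc_fast::checksum(crc_fast::CrcAlgorithm::Crc32IsoHdlc, data) as u32
    }
}

/// Call sites stay identical whichever backend was selected at build time.
pub fn checksum(data: &[u8]) -> u32 {
    backend::crc32(data)
}

fn main() {
    println!("{:#010x}", checksum(b"123456789"));
}
```

Because the second module is compiled out by default, users who never enable the feature would see neither the extra dependencies nor the newer compiler requirement.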

@fintelia
Contributor

The crc32fast crate implements the single CRC algorithm version that this crate needs while crc-fast implements a whole bunch (and seems to be faster). How feasible would it be to extract the improved implementation and upstream it into crc32fast?

Also, the README in this repository describes a pretty aggressive MSRV policy of only supporting the current and immediately previous stable Rust compilers. That might not match the actual intent?

@jongiddy
Contributor

Yeah, the MSRV policy and the rust-version sound contrary, but it is difficult to specify the actual intent in a single Cargo-enforced value. See #425 (comment)

@Byron
Member

Byron commented Dec 29, 2025

Thanks everyone for chiming in.

It seems this PR needs a justification by providing a repeatable benchmark with both crates. To facilitate that, it would be easiest to add the new crate behind a feature toggle.

When both are in place, and once there is interest by consumers of flate2, we should be able to merge this PR or work on finding a way to get the improved algorithm into the existing CRC crate.

Does that sound feasible, or am I missing something?

Reduces dependency tree (removes ‘crc’, ‘crc-catalog’, and
‘rustversion’).
@onethumb
Author

Thanks for the discussion.

I just published 1.10.0 which reduces the dependency tree, thanks for the feedback.

Fair point on benchmarking, I've been using it in some heavy CRC projects, but it's possible the impact might not be as large with this crate. I'll see if I can find some time to benchmark with / without and see. 👍

@Byron Byron marked this pull request as draft December 31, 2025 20:00
@Byron
Member

Byron commented Dec 31, 2025

I turned it back to draft for now, and would appreciate a task list in the PR itself to see where it's at.
Thanks again.

PS: I just now realised that you are the author of the crc-fast crate, which would have been good to read as a disclaimer. After all, the PR isn't motivated by performance wins for flate2 (or there would be benchmarks), but to promote a crate of personal interest. Let's be extra-diligent then, but also try to genuinely make the ecosystem better in the vein of zlib-rs.

@onethumb
Author

onethumb commented Dec 31, 2025

Oh, sorry, fair point. I hadn't even considered the disclaimer.

To be clear, I'm not really motivated by promoting the crate for personal interests, I'm motivated by the increased performance gains (and reduced costs and energy consumption) that we've seen in other projects, and would like to share the wins.

If it's not a material performance win for flate2, it'd be silly to bother, and the benchmarking is a fair point. The wins have been so big on other crates, it didn't occur to me to benchmark this one separately. Most of those other projects spend a significant amount of time checksumming data (photos, videos, S3 objects, etc), but I can (now) easily imagine that flate2 may not spend nearly as high of a percentage. I'll find out.

Thanks for the conversation. This is my first real Rust project, so I'm still learning the ins and outs of the community, best practices, etc. Sorry again for not mentioning that I was the author.

@Byron
Member

Byron commented Jan 1, 2026

Thanks!

Yes, let's see if this crate can make a difference. Personally, I am ignorant enough to not even know where CRC hashes are used here. It will be interesting.

Please note that I added some checkboxes to the PR itself to keep track of its state more easily - do feel free to edit at will.
A goal could be to see if the performance gains here can be worth it for common operations, and if so, make it available behind a feature toggle for community testing.
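
On where the CRC enters: the gzip format (RFC 1952) ends every member with a CRC-32 of the uncompressed data followed by its length modulo 2^32, so gzip encoding and decoding are the likely places a CRC-32 is computed. The sketch below illustrates this without using flate2 at all. `gzip_stored` and `trailer` are hypothetical helpers that build a gzip member by hand from one stored (uncompressed) DEFLATE block and read the trailer back.

```rust
use std::convert::{TryFrom, TryInto};

/// Bitwise CRC-32 (ISO-HDLC), the checksum the gzip trailer carries.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Wraps `payload` in a gzip member holding a single stored DEFLATE block.
/// A stored block carries at most 65535 bytes, hence the `u16` conversion.
fn gzip_stored(payload: &[u8]) -> Vec<u8> {
    let len = u16::try_from(payload.len()).expect("payload too large for one stored block");
    // Header: ID1, ID2, CM = 8 (deflate), FLG, MTIME (4 bytes), XFL, OS = unknown.
    let mut out: Vec<u8> = vec![0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 0xff];
    out.push(0x01); // BFINAL = 1, BTYPE = 00 (stored)
    out.extend_from_slice(&len.to_le_bytes()); // LEN
    out.extend_from_slice(&(!len).to_le_bytes()); // NLEN, one's complement of LEN
    out.extend_from_slice(payload);
    out.extend_from_slice(&crc32(payload).to_le_bytes()); // CRC32 of the raw data
    out.extend_from_slice(&u32::from(len).to_le_bytes()); // ISIZE
    out
}

/// Reads the CRC32 and ISIZE fields from the last eight bytes of a member.
fn trailer(gz: &[u8]) -> Option<(u32, u32)> {
    let tail = gz.len().checked_sub(8)?;
    let crc = u32::from_le_bytes(gz[tail..tail + 4].try_into().ok()?);
    let size = u32::from_le_bytes(gz[tail + 4..].try_into().ok()?);
    Some((crc, size))
}

fn main() {
    let gz = gzip_stored(b"hello");
    let (crc, size) = trailer(&gz).expect("member is at least eight bytes long");
    println!("CRC32 = {:#010x}, ISIZE = {}", crc, size);
}
```

Since the checksum covers every uncompressed byte exactly once, its share of total runtime depends on how expensive the compression or decompression itself is, which is what the requested benchmark would need to measure.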
