igzip provides the fastest zlib/gzip-compatible compression and decompression on x86 CPUs to date (Oct 27, 2021). It is a submodule of the Intel(R) Intelligent Storage Acceleration Library (ISA-L), optimized with low-level techniques including hand-written assembly and AVX-512 instructions to reach peak performance, especially on Intel(R) x86 platforms. A brief overview:
- Supports the RFC 1951 DEFLATE standard, like canonical zlib.
- 4 compression levels, which trade speed against compression ratio.
- Multi-threaded compression with up to 8 threads.
- Optimized with low-level instructions for performance-critical scenarios.
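The level/ratio trade-off can be illustrated with Python's stdlib zlib, which implements the same DEFLATE standard. Note this is only an analogy: igzip exposes levels 0-3 while canonical zlib uses 0-9, so the level numbers differ, but the trade-off is the same.

```python
import zlib

# Repetitive sample data; real FASTQ or log data behaves similarly.
data = b"GATTACA" * 100_000

fast = zlib.compress(data, level=1)   # fastest, larger output
best = zlib.compress(data, level=9)   # slowest, smallest output

# A higher level typically shrinks the output at the cost of speed.
assert len(best) <= len(fast)
# Both are valid DEFLATE streams and round-trip losslessly.
assert zlib.decompress(fast) == data
assert zlib.decompress(best) == data
```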
For more details of the zlib solutions in ISA-L, please see here: Zlib Solutions of Intel(R) ISA-L and Intel(R) IPP.
To provide out-of-the-box compression/decompression functions, we built the igzip wrapper on top of the awesome ISA-L; it supports direct transformation between a C-style string and a gzip file.
/* igzip inflate wrapper */
int decompress_file(const char *infile_name, unsigned char *output_string, size_t *output_length);
/* igzip deflate wrapper */
int compress_file(unsigned char *input_string, size_t input_length, const char *outfile_name, int compress_level, int thread_num);
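For intuition, the string-to-gzip-file transformation these two functions perform can be mimicked with Python's stdlib gzip module. This is a behavioral sketch only, not the wrapper's implementation, which is C code linked against ISA-L.

```python
import gzip
import os
import tempfile

payload = b"The quick brown fox jumps over the lazy dog.\n" * 1000

# Analogue of compress_file(): in-memory buffer -> gzip file on disk.
fd, path = tempfile.mkstemp(suffix=".gz")
os.close(fd)
with gzip.open(path, "wb", compresslevel=1) as f:
    f.write(payload)

# Analogue of decompress_file(): gzip file on disk -> in-memory buffer.
with gzip.open(path, "rb") as f:
    restored = f.read()

assert restored == payload
os.remove(path)
```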
The current loosely coupled structure makes it easy to customize and to add new features such as streaming inflate or deflate. Feel free to copy and adapt it to your own design!
- CMake v3.2 or later
- ISA-L v2.30.0 or later
- pthreads
git clone [email protected]:ueqri/igzip-wrapper.git
mkdir -p igzip-wrapper/build
cd igzip-wrapper/build
cmake ..
make -j
Note: multi-threading support for deflating (i.e. compression) is enabled by default. If you want to build a single-threaded version, configure with cmake -DMULTI_THREADED_DEFLATE=OFF ..
instead. As for inflating, only a single thread is supported, a restriction imposed by the nature of the gzip format.
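The asymmetry comes from the gzip format itself: a gzip file may contain several independently compressed members concatenated back to back, so a multi-threaded deflater can let each thread emit its own member, while an inflater must walk the stream sequentially because member boundaries are not known in advance. The member-concatenation property can be demonstrated with Python's stdlib gzip module:

```python
import gzip

part1 = b"first chunk, compressed independently\n"
part2 = b"second chunk, compressed independently\n"

# Two independent gzip members, as two deflate threads could produce.
members = gzip.compress(part1) + gzip.compress(part2)

# Decompressing the concatenation yields the concatenated payloads,
# which is why per-member parallel compression stays gzip-compatible.
assert gzip.decompress(members) == part1 + part2
```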
- Copy igzip_wrapper.h and libigzipwrap.so to your program.
- Include the wrapper header in the C/C++ source.
- Link the library when building the program.
If you want to build tests for the inflate & deflate APIs and measure the time cost on your machine, replace the previous cmake ..
with the following command.
cmake -DBUILD_TEST=ON ..
Usage of the test executables in the build directory:
# Test decompression with check file
./inflate <path-to-source-gzip-file> <path-to-uncompressed-check-file>
# Test compression and use inflate API to check
./deflate <path-to-uncompressed-source> <path-to-output-gzip-file>
A series of comprehensive benchmarks were done by Ruben Vorderman (thanks @rhpvorderman) of Python community.
Details of the benchmarks
The system was based on Ryzen 5 3600 with 2x16GB DDR4-3200 memory, and running Debian 10.
All benchmarks were performed on a tmpfs which lives in memory to prevent I/O bottlenecks, and using hyperfine for better analysis.
The test file was a 5-million-read FASTQ file of 1.6 GB. These types of files are common in bioinformatics at 100+ GB sizes, so this is a good real-world benchmark.
pigz was also benchmarked on one thread, as it implements zlib in a faster way than gzip. Zstd was benchmarked as a comparison.
Versions:
gzip 1.9 (provided by debian)
pigz 2.4 (provided by debian)
igzip 2.25.0 (provided by debian)
libdeflate-gzip 1.6 (compiled by conda-build with the recipe here: https://github.com/conda-forge/libdeflate-feedstock/pull/4)
zstd 1.3.8 (provided by debian)
Compression: level 1 was chosen for all compression benchmarks by default. Time is the average over 10 runs.
COMPRESSION

| program | time | size | memory |
|---------|------|------|--------|
| gzip | 23.5 seconds | 657M | 1.5M |
| pigz (one thread) | 22.2 seconds | 658M | 2.4M |
| libdeflate-gzip | 10.1 seconds | 623M | 1.6G (reads entire file in memory) |
| igzip | 4.6 seconds | 620M | 3.5M |
| zstd (to .zst) | 6.1 seconds | 584M | 12.1M |
Decompression: all programs decompressed the file created using gzip -1 (even zstd, which can also decompress gzip).
DECOMPRESSION

| program | time | memory |
|---------|------|--------|
| gzip | 10.5 seconds | 744K |
| pigz (one thread) | 6.7 seconds | 1.2M |
| libdeflate-gzip | 3.6 seconds | 2.2G (reads in mem before writing) |
| igzip | 3.3 seconds | 3.6M |
| zstd (from .gz) | 6.4 seconds | 2.2M |
| zstd (from .zst) | 2.3 seconds | 3.1M |
As the benchmarks above show, using Intel's Storage Acceleration Libraries may improve performance quite substantially, offering very fast compression and decompression. This puts igzip in the zstd ballpark in terms of speed while still offering backwards compatibility with gzip.
On my side, after controlling for other variables, I found that igzip (8 threads, v2.30.1) decompresses ~2x faster than zstd (v1.5.0) and ~4x faster than gzip (v1.11), and the speedup for compression is even higher.
Include much faster DEFLATE implementations ISA-L in Python's gzip and zlib libraries
Storage acceleration with ISA-L (From Page 9)