Xopen uses a mixture of zlib, isal and zlib-ng. The current default is to prefer isal, then zlib-ng and then zlib.
Below are a few benchmarks on an Illumina FASTQ file, which compresses quite well.
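Compression ratios (compressed size relative to original):

| compression level | zlib | isa-l | zlib-ng |
|---|---|---|---|
| 0 | 100.01% | 25.28% | 100.01% |
| 1 | 23.88% | 22.49% | 34.99% |
| 2 | 22.88% | 22.47% | 22.92% |
| 3 | 21.66% | 22.60% | 21.32% |
| 4 | 21.47% | - | 20.44% |
| 5 | 20.76% | - | 19.93% |
| 6 | 19.72% | - | 19.27% |
| 7 | 19.28% | - | 19.19% |
| 8 | 19.03% | - | 19.01% |
| 9 | 18.93% | - | 18.83% |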
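Compression times (seconds):

| compression level | zlib | isa-l | zlib-ng |
|---|---|---|---|
| 0 | 1.29 | 2.89 | 1.15 |
| 1 | 11.46 | 2.78 | 4.61 |
| 2 | 12.92 | 2.87 | 8.28 |
| 3 | 22.12 | 6.77 | 11.57 |
| 4 | 18.69 | - | 17.40 |
| 5 | 35.37 | - | 21.68 |
| 6 | 92.65 | - | 37.62 |
| 7 | 168.00 | - | 112.92 |
| 8 | 241.37 | - | 143.97 |
| 9 | 327.82 | - | 208.06 |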
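For anyone who wants to reproduce this kind of measurement, here is a minimal sketch using only the standard library `gzip` module (so it covers the zlib column only). The helper name `benchmark_levels` and the synthetic FASTQ-like sample are assumptions for illustration; this is not the script or the file that produced the numbers above.

```python
import gzip
import time


def benchmark_levels(data: bytes, levels=range(10)):
    """Return a list of (level, size in percent of original, seconds)."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = gzip.compress(data, compresslevel=level)
        elapsed = time.perf_counter() - start
        results.append((level, 100 * len(compressed) / len(data), elapsed))
    return results


if __name__ == "__main__":
    # Synthetic, highly repetitive FASTQ-like records (hypothetical sample).
    sample = b"@read1\nACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIII\n" * 20_000
    for level, percent, seconds in benchmark_levels(sample):
        print(f"{level}\t{percent:.2f}%\t{seconds:.2f}")
```

Real sequencing data is far less repetitive than this sample, so the absolute ratios will differ; only the shape of the comparison carries over.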
From this I take it that preferring python-isal is a good thing. On the lowest compression levels, it provides the best performance as well as the best compression. However, this results in the following weird behaviours:
If compression level is 0 and isal is available, the file will be compressed rather than uncompressed.
Levels 1, 2 and 3 are virtually indistinguishable in file size.
Level 3 is significantly slower (6.77 s versus 2.87 s for level 2) while the file is no smaller (22.60% versus 22.47%). This is because it uses AVX-512, which leads to slower results on a processor without AVX-512. Level 2 uses AVX2, by the way.
I propose that only levels 1 and 2 are forwarded to isal. Levels 3-9 should be done by zlib-ng. Level 0 should be the uncompressed block format that zlib gives by default. This way using xopen across multiple systems with different available libraries gives a relatively consistent compression experience.
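The proposed routing could be sketched roughly as follows. The function names `available_backends` and `choose_backend` are hypothetical and not part of xopen's API; `isal` and `zlib_ng` are the import names of python-isal and python-zlib-ng. Falling back to zlib-ng and then zlib for levels 1 and 2 when isal is missing is an assumption that follows the current preference order.

```python
import importlib.util


def available_backends():
    """Detect which compression modules can be imported on this system."""
    names = ("isal", "zlib_ng", "zlib")
    return {n for n in names if importlib.util.find_spec(n) is not None}


def choose_backend(level, available):
    """Pick a backend name for a gzip compression level (0-9).

    Hypothetical helper illustrating the proposal:
      level 0    -> zlib (stored, uncompressed blocks)
      levels 1-2 -> isal if present, otherwise zlib_ng, otherwise zlib
      levels 3-9 -> zlib_ng if present, otherwise zlib
    """
    if not 0 <= level <= 9:
        raise ValueError(f"compression level must be 0-9, got {level}")
    if level == 0:
        return "zlib"
    if level in (1, 2) and "isal" in available:
        return "isal"
    if "zlib_ng" in available:
        return "zlib_ng"
    return "zlib"


if __name__ == "__main__":
    found = available_backends()
    for level in range(10):
        print(level, choose_backend(level, found))
```

With this routing, level 0 is never compressed regardless of which libraries are installed, and isal level 3 is never used.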