Add compress() function to decrease the file size of BinaryCIF files #674

Merged: padix-key merged 7 commits into biotite-dev:main from padix-key:compress on Oct 16, 2024

Conversation

padix-key
Member

When data is set in a BinaryCIFFile object (e.g. via set_structure()), a simple default encoding is used that does not compress the data. This is fast, but the resulting files can be significantly larger than necessary. This PR introduces the compress() function, which takes an existing BinaryCIF... object and creates a new BinaryCIF... object with size-optimized encoding.
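A minimal usage sketch of the workflow described above. It assumes that `compress()` is exposed in `biotite.structure.io.pdbx` alongside `BinaryCIFFile` and `set_structure()` (the module path is not stated in this PR description), and `write_compressed_bcif` is a hypothetical helper name used only for illustration:

```python
def write_compressed_bcif(atoms, path):
    """Write a structure to a BinaryCIF file with size-optimized encoding.

    Hypothetical helper; assumes compress() lives in
    biotite.structure.io.pdbx next to BinaryCIFFile and set_structure().
    """
    # Imported inside the function so the sketch loads without Biotite installed
    import biotite.structure.io.pdbx as pdbx

    bcif_file = pdbx.BinaryCIFFile()
    # set_structure() fills the file using the fast default encoding
    pdbx.set_structure(bcif_file, atoms)
    # Per the description above, compress() creates a new object
    # with size-optimized encoding from the existing one
    compressed = pdbx.compress(bcif_file)
    compressed.write(path)
    return compressed
```

Compression is kept as a separate, explicit step here, so callers who care more about write speed than file size can skip it.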

The resulting files are significantly smaller than the uncompressed ones and even somewhat smaller than the ones originating from the PDB. For 1L2Y, the compressed file is about 87% smaller than the uncompressed file and about 15% smaller than the original file from the PDB:

Original file from PDB:   199893 bytes
Uncompressed file:       1299713 bytes
File after `compress()`:  169448 bytes
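To give an intuition for where savings of this kind can come from, here is a minimal sketch of two encodings defined by the BinaryCIF format, delta and run-length, applied to a column of consecutive integers such as atom IDs. This is not Biotite's implementation: the function names and the example column are assumptions for illustration only.

```python
from itertools import groupby


def delta_encode(values):
    """Replace each value by its difference to the preceding value.

    The first value is stored relative to an origin of 0.
    """
    previous = 0
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def run_length_encode(values):
    """Collapse runs of equal values into flat (value, run length) pairs."""
    encoded = []
    for value, run in groupby(values):
        encoded.extend([value, len(list(run))])
    return encoded


def decode(encoded):
    """Invert run-length encoding, then delta encoding."""
    deltas = []
    for value, count in zip(encoded[0::2], encoded[1::2]):
        deltas.extend([value] * count)
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


# Hypothetical column: 1000 consecutive atom IDs
atom_ids = list(range(1, 1001))
encoded = run_length_encode(delta_encode(atom_ids))
print(len(atom_ids), "->", len(encoded))  # prints "1000 -> 2"
```

The chain is lossless: `decode(encoded)` returns the original column. How well it works depends on the data, which is why an encoding chosen per column can beat a single default.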


codspeed-hq bot commented Oct 12, 2024

CodSpeed Performance Report

Merging #674 will not alter performance

Comparing padix-key:compress (8320714) with main (231eefe)

Summary

✅ 44 untouched benchmarks

🆕 1 new benchmark

Benchmarks breakdown

| Benchmark | main | padix-key:compress | Change |
|---|---|---|---|
| 🆕 benchmark_compress | N/A | 724.2 ms | N/A |

@padix-key padix-key merged commit 241655b into biotite-dev:main Oct 16, 2024
27 checks passed
@padix-key padix-key deleted the compress branch October 23, 2024 12:20