Sparse storage for growable axes #235
Comments
Some notes:
The author of sparsepp suggests this:
parallel-hashmap looks nice and would be easy to add as a dep.
Per the discussion in boostorg/histogram#389, I can work on adding the map storage to the boost histogram bindings based on https://github.com/greg7mdp/parallel-hashmap.
Hello everybody, what is the status of the sparse implementation in boost histogram? I would like to use it because I am dealing with many dimensions and many empty bins (approx. 95% of the bins are empty).
@Superharz, we ended up not working on this one.
Copy of part 7 of #214
About categorical axes, it looks like the storage contains the outer product of growable categories:
returns 9290 (787 empty), while for comparison,
returns 4898 (2880 empty). Coffea histograms have a fair bit of pickling overhead compared to boost, but the sparseness catches up.
Is there a way to request sparse bin storage here?
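To make the outer-product growth concrete, here is a minimal sketch in plain Python. It does not use the boost-histogram or coffea APIs; the category names and the `fills` list are made up for illustration. It only counts how many cells a dense layout would allocate for two growable categorical axes versus how many a map-based layout would need.

```python
import math

# Hypothetical fills: each entry is one (dataset, channel) category pair.
fills = [("data", "ee"), ("ttbar", "mm"), ("dy", "em")]

# Growable categorical axes end up holding every label seen so far.
datasets = sorted({d for d, _ in fills})
channels = sorted({c for _, c in fills})

# Dense storage allocates the full outer product of the axis sizes.
dense_cells = math.prod([len(datasets), len(channels)])

# Map-based (sparse) storage only allocates cells that were actually filled.
sparse = {}
for key in fills:
    sparse[key] = sparse.get(key, 0) + 1
sparse_cells = len(sparse)

empty_cells = dense_cells - sparse_cells
print(dense_cells, sparse_cells, empty_cells)
```

With three datasets that each appear in a single channel, the dense layout holds 9 cells, 6 of them empty, and the gap widens with every additional categorical axis.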
Comment from @HDembinski
To clarify, in coffea it's a mix of sparse and dense storage, where any fixed-size axes are stored in a dense format, and each dense chunk is treated like a cell in the sparse product.
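A rough sketch of that mixed layout, assuming a dict keyed by the categorical labels with one dense chunk per key. `HybridStorage` and its methods are hypothetical names for illustration, not coffea's or boost-histogram's actual implementation, and a plain list stands in for the dense array.

```python
class HybridStorage:
    """Sparse over categorical keys, dense over the fixed-size axes (illustrative only)."""

    def __init__(self, n_dense):
        self.n_dense = n_dense  # number of bins in the flattened dense axes
        self.chunks = {}        # sparse part: category tuple -> dense chunk

    def fill(self, sparse_key, dense_index, weight=1.0):
        # Allocate a dense chunk only the first time a category tuple is seen.
        chunk = self.chunks.get(sparse_key)
        if chunk is None:
            chunk = [0.0] * self.n_dense
            self.chunks[sparse_key] = chunk
        chunk[dense_index] += weight

    def allocated_cells(self):
        # Total number of bins actually held in memory.
        return len(self.chunks) * self.n_dense


h = HybridStorage(n_dense=4)
h.fill(("data", "ee"), 0)
h.fill(("data", "ee"), 3, weight=2.0)
h.fill(("ttbar", "mm"), 1)
print(h.allocated_cells())
```

Only the two category tuples that were filled get a dense chunk, so 8 bins are allocated here regardless of how many other category combinations could exist.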