Added md5sum utility functions for faster data loading #8

Open
wants to merge 1 commit into
base: main

g-antonello
Collaborator

The usage is described in the added file. In brief, my idea is that every time you generate a new data version, you also run a couple of extra functions with the same parameters used to generate that version. This should be enough to produce an md5sum object that can then be compared at load time.

A data loading example is also shown in the same script.

Overall, these functions could be implemented in a fancier way inside a data generation pipeline, but even as they are they should speed up data loading by at least 5x.
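
For readers who have not opened the added file, here is a minimal sketch of the idea in Python. It is not the code from this PR: it assumes the md5sum is computed from the generation parameters and stored in a sidecar file next to the data, and every name in it (`params_md5`, `save_version`, `load_or_regenerate`, the `.md5` sidecar, the toy `generate` function) is hypothetical.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def params_md5(params: dict) -> str:
    """Hash the generation parameters, independent of key order."""
    canonical = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.md5(canonical).hexdigest()


def save_version(data, params: dict, path: Path) -> None:
    """Write the data plus a sidecar file holding the md5 of its parameters."""
    path.write_bytes(pickle.dumps(data))
    path.with_suffix(".md5").write_text(params_md5(params))


def load_or_regenerate(params: dict, path: Path, generate):
    """Reuse the stored file if its md5 matches the parameters, else regenerate."""
    sidecar = path.with_suffix(".md5")
    if path.exists() and sidecar.exists() and sidecar.read_text() == params_md5(params):
        return pickle.loads(path.read_bytes())
    data = generate(**params)
    save_version(data, params, path)
    return data


# Toy stand-in for an expensive data generation step (hypothetical).
calls = []


def generate(n, scale):
    calls.append((n, scale))
    return [i * scale for i in range(n)]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "data.pkl"
    first = load_or_regenerate({"n": 3, "scale": 2}, cache, generate)   # generates
    second = load_or_regenerate({"scale": 2, "n": 3}, cache, generate)  # md5 matches, loads from disk
    third = load_or_regenerate({"n": 4, "scale": 2}, cache, generate)   # parameters changed, regenerates
```

In this sketch the expensive step runs only when the parameters differ from those recorded with the stored version, which is where any loading speedup would come from. The actual gain depends on how costly generation is compared with reading the file.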
