
Allow recompute via _make_file func #1093

Draft · CBroz1 wants to merge 17 commits into master

Conversation

@CBroz1 (Member) commented on Sep 6, 2024

Description

This PR adds the ability to recompute a non-existent file when it is fetched from v1.SpikeSortingRecording, along with some infrastructure to support expanding this tool in the future. To facilitate this, I add:

  • A standard _make_file method that other tables can implement in the future (a sketch follows this list).
  • NWB-file-specific and more general directory hashing tools.
  • Tests for round-trip file deletion, then calling fetch_nwb.
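
As a rough illustration of the _make_file convention, here is a minimal sketch; `RecomputeMixin`, `fetch_or_recompute`, and `_file_path` are hypothetical names for illustration, not necessarily the exact hooks in this PR:

```python
from pathlib import Path


class RecomputeMixin:
    """Hypothetical mixin sketching the _make_file convention.

    Tables that can regenerate their analysis files implement
    _make_file; the fetch path falls back to it when the file
    referenced by a key is missing from disk.
    """

    def _make_file(self, key: dict) -> Path:
        """Recompute and write the file for `key`; return its path."""
        raise NotImplementedError

    def fetch_or_recompute(self, key: dict) -> Path:
        # Assumed helper: _file_path resolves the expected on-disk
        # location for this key.
        path = self._file_path(key)
        if not path.exists():
            path = self._make_file(key)  # regenerate the missing file
        return path
```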

I have tested these hashers (sketched after the list below) on...

  • All files in v1.SpikeSortingRecording
  • All directories in v0.SpikeSortingRecording
  • A subset of files in Raw (ongoing; 50% so far)
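
For reference, a general directory hasher along these lines can be built from the standard library alone. This is a sketch under my own naming (`hash_directory`), not the exact implementation in this branch:

```python
import hashlib
from pathlib import Path


def hash_directory(directory: str, chunk_size: int = 2**20) -> str:
    """Return a deterministic MD5 hex digest over a directory's contents.

    Files are visited in sorted order, and each file's relative path is
    folded into the digest so that renames also change the hash.
    """
    hasher = hashlib.md5()
    root = Path(directory)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        hasher.update(str(path.relative_to(root)).encode())
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                hasher.update(chunk)
    return hasher.hexdigest()
```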

Checklist:

  • This PR should be accompanied by a release: no
  • If release, I have updated the CITATION.cff: n/a
  • This PR makes edits to table definitions: yes
  • If table edits, I have included an alter snippet for release notes: yes
  • If this PR makes changes to position, I ran the relevant tests locally: n/a
  • I have updated the CHANGELOG.md with PR number and description: yes
  • I have added/edited docs/notebooks to reflect the changes

Question

I found at least one file/directory where the hash mismatched between

  • (a) the existing version
  • (b) regenerating with this branch

What should we do about these? Do we need to restrict our deletions to only items replicable with an up-to-date Spyglass? Should I start building the infrastructure to compile this list? What do we want to do with files that don't replicate? Do we then need to experiment with dependency versions? This might point us toward adding conda env dumps to some part of this process (see the sketch below).
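
If we do go the env-dump route, capturing the environment is cheap. A sketch, assuming we just store the output of `conda env export` next to whatever recompute record we keep (`dump_conda_env` is a hypothetical helper name):

```python
import subprocess
from pathlib import Path


def dump_conda_env(out_path: str) -> None:
    """Write the active conda environment spec to out_path."""
    result = subprocess.run(
        ["conda", "env", "export"],
        capture_output=True, text=True, check=True,
    )
    Path(out_path).write_text(result.stdout)
```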

@CBroz1 (Member, Author) commented on Sep 16, 2024

I ran into some issues here: I was unable to replicate the DJ-stored file hash even with seemingly matched contents. I need to do more testing on whether or not memory pointers are factored into the hash. If so, we may need to store a hash of the object computed by the specific table, and adjust the DJ-stored file hash on recreation. A sketch of what a content-level hash could look like follows.
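
For context, byte-level hashes of HDF5/NWB files can differ even when the stored data match (object IDs, timestamps, and on-disk layout all vary across rewrites). Below is a sketch of a content-level hash using h5py, hashing dataset names, values, and attributes instead of raw bytes; `hash_nwb_contents` is a hypothetical name, and this is an assumption about one possible approach, not what this branch currently does:

```python
import hashlib

import h5py
import numpy as np


def hash_nwb_contents(filepath: str) -> str:
    """MD5 over the names, values, and attributes of an HDF5 file,
    ignoring byte-level layout that can vary across rewrites."""
    hasher = hashlib.md5()

    def visit(name, obj):
        hasher.update(name.encode())
        for attr_name in sorted(obj.attrs):
            hasher.update(attr_name.encode())
            hasher.update(repr(obj.attrs[attr_name]).encode())
        if isinstance(obj, h5py.Dataset):
            data = obj[()]
            if isinstance(data, np.ndarray) and data.dtype.kind != "O":
                hasher.update(data.tobytes())
            else:
                # Fall back to a string form for scalars and
                # variable-length (object-dtype) data.
                hasher.update(str(np.asarray(data).tolist()).encode())

    with h5py.File(filepath, "r") as f:
        f.visititems(visit)
    return hasher.hexdigest()
```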

@edeno added the enhancement (New feature or request) label on Sep 25, 2024