Repack Nwb Files #1003

pauladkisson · 2024-08-13T14:57:28Z

Fixes #892

Depends on

Wrap data in set_data_io with a DataChunkIterator to support overriding hdf5 dataset backend configurations hdmf-dev/hdmf#1172
added link_data --> clear_cache relationship to support repacking zarr nwbfiles hdmf-dev/hdmf-zarr#215
[Bug]: Error when pixel_mask PlaneSegmentation is exported hdmf-dev/hdmf-zarr#272
[Bug]: pixel mask gets duplicated every time it is written hdmf-dev/hdmf-zarr#278
[Bug]: compound Dtypes do not support custom chunking and compression with Zarr. hdmf-dev/hdmf#1296

src/neuroconv/tools/nwb_helpers/_configuration_models/_hdf5_dataset_io.py

temp_test.py

pauladkisson · 2024-08-14T23:53:29Z

What are you thinking about how the API should look for using this repack helper?

@CodyCBakerPhD, the code now reflects my vision for the API: repack_nwbfile takes an on-disk nwbfile and export path, configures the backend, and exports the nwbfile. Users can optionally specify the template (existing or default) and any manual changes to the backend config.

The code then follows one of two paths:

- existing: backend info is read directly from the nwbfile
- default: default backend info is obtained for each neurodata object in the nwbfile
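A rough sketch of that two-path flow, using plain dicts in place of the real configuration models. The function name `build_backend_configuration`, the dataset location, and the config keys are all hypothetical and only illustrate the "pick a template, then apply manual changes" idea; this is not the signature of repack_nwbfile.

```python
from copy import deepcopy


def build_backend_configuration(template, existing, default, changes=None):
    """Pick a per-dataset configuration template, then apply manual overrides.

    Hypothetical helper: configurations are plain dicts keyed by dataset location.
    """
    if template == "existing":
        configuration = deepcopy(existing)  # settings as read from the file on disk
    elif template == "default":
        configuration = deepcopy(default)  # freshly computed default settings
    else:
        raise ValueError(f"template must be 'existing' or 'default', got {template!r}")

    # Manual changes are applied on top of whichever template was chosen.
    for location, overrides in (changes or {}).items():
        configuration[location].update(overrides)
    return configuration


# Made-up example data for a single dataset.
existing = {"acquisition/raw/data": {"chunk_shape": None, "compression_method": None}}
default = {"acquisition/raw/data": {"chunk_shape": (1000, 64), "compression_method": "gzip"}}
changes = {"acquisition/raw/data": {"compression_method": "lzf"}}

repacked = build_backend_configuration("default", existing, default, changes)
print(repacked)
```

Copying the template before applying changes keeps the inputs untouched, so the same `existing` and `default` dicts can be reused for several exports.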

lmk what you think!

src/neuroconv/tools/nwb_helpers/_backend_configuration.py

...est_backend_and_dataset_configuration/test_helpers/test_get_default_backend_configuration.py

src/neuroconv/tools/nwb_helpers/_metadata_and_file_helpers.py

...est_backend_and_dataset_configuration/test_helpers/test_get_default_backend_configuration.py

src/neuroconv/tools/nwb_helpers/_metadata_and_file_helpers.py

h-mayorquin · 2025-04-15T19:39:09Z

src/neuroconv/tools/nwb_helpers/_configure_backend.py

+            data_chunk_iterator_kwargs = dict()
+        else:
+            data_chunk_iterator_class = DataChunkIterator
+            data_chunk_iterator_kwargs = dict(buffer_size=np.prod(dataset_configuration.buffer_shape))


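Use math.prod rather than np.prod here: np.prod works in fixed-width integers and can overflow for large shapes, which would screw up your buffer size.

Are these changes related to repacking?

Use math.prod rather than np.prod here: np.prod works in fixed-width integers and can overflow for large shapes, which would screw up your buffer size.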

Ok, will do.
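For illustration, a small stdlib-only sketch of why the choice of product function matters. It does not call np.prod; instead the hypothetical helper `fixed_width_prod` mimics a 32-bit signed accumulator with `ctypes.c_int32` (which does no overflow checking), while `math.prod` works on arbitrary-precision Python ints. The buffer shape is made up.

```python
import ctypes
import math


def fixed_width_prod(shape):
    """Multiply the dimensions the way a 32-bit signed accumulator would (wraps silently)."""
    total = 1
    for dim in shape:
        # c_int32 truncates to 32 bits without raising, emulating integer overflow.
        total = ctypes.c_int32(total * dim).value
    return total


buffer_shape = (65_536, 65_536)  # hypothetical buffer shape with 2**32 elements

exact = math.prod(buffer_shape)  # arbitrary-precision Python int
wrapped = fixed_width_prod(buffer_shape)  # simulated fixed-width result

print(f"math.prod: {exact}, fixed-width: {wrapped}")
```

Whether a real np.prod call overflows depends on the integer width NumPy picks on the platform, which is an assumption this sketch sidesteps by simulating the narrow case directly.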

Are these changes related to repacking?

Yes, the DataChunkIterator solves this issue: hdmf-dev/hdmf#1170

Ah, thanks, OK. So hdmf does not support resetting the chunking without copying, so we do this through an iterator.
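The general pattern of streaming data through an iterator in fixed-size buffers can be sketched in plain Python. This is not hdmf's DataChunkIterator, just a hypothetical `iter_buffers` helper, under the simplifying assumption that the data can be consumed as a flat iterable.

```python
from itertools import islice


def iter_buffers(values, buffer_size):
    """Yield successive lists of at most `buffer_size` items from any iterable."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be a positive integer")
    iterator = iter(values)
    while True:
        buffer = list(islice(iterator, buffer_size))
        if not buffer:
            return  # source exhausted
        yield buffer


# Ten values streamed in buffers of four: the last buffer is shorter.
buffers = list(iter_buffers(range(10), buffer_size=4))
print(buffers)
```

Because the consumer only ever sees one buffer at a time, the writer is free to lay the data out with whatever chunking it has been configured to use.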

Mmm, you are solving your problem, and this is great, but in this PR, as well as the one that you did for detecting compound data types, I kind of feel that hdmf is kicking its complexity up to us for something that it should provide.

I guess that we are faster : P

Speaking of which, I ran into another issue with hdmf-zarr when it comes to compound dtypes -- see here: hdmf-dev/hdmf-zarr#272

pauladkisson · 2025-04-18T17:08:46Z

Could you please add a chapter in the user guide?

Done!

codecov · 2025-04-18T18:18:15Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.61%. Comparing base (d09abd9) to head (1f1c629).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1003      +/-   ##
==========================================
+ Coverage   90.31%   90.61%   +0.30%     
==========================================
  Files         138      138              
  Lines        8876     9000     +124     
==========================================
+ Hits         8016     8155     +139     
+ Misses        860      845      -15

| Flag | Coverage Δ |
| --- | --- |
| unittests | `90.61% <100.00%> (+0.30%)` ⬆️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/neuroconv/tools/nwb_helpers/__init__.py | `100.00% <100.00%> (ø)` |
| ...roconv/tools/nwb_helpers/_backend_configuration.py | `100.00% <100.00%> (ø)` |
| ...nwb_helpers/_configuration_models/_base_backend.py | `100.00% <100.00%> (ø)` |
| ..._helpers/_configuration_models/_base_dataset_io.py | `98.70% <100.00%> (+0.20%)` ⬆️ |
| ..._helpers/_configuration_models/_hdf5_dataset_io.py | `85.41% <100.00%> (+16.84%)` ⬆️ |
| ..._helpers/_configuration_models/_zarr_dataset_io.py | `96.96% <100.00%> (+13.33%)` ⬆️ |
| .../neuroconv/tools/nwb_helpers/_configure_backend.py | `93.47% <100.00%> (+2.56%)` ⬆️ |
| ...roconv/tools/nwb_helpers/_dataset_configuration.py | `95.68% <100.00%> (+2.01%)` ⬆️ |
| ...nv/tools/nwb_helpers/_metadata_and_file_helpers.py | `92.52% <100.00%> (+3.42%)` ⬆️ |

pauladkisson added 3 commits August 12, 2024 11:39

setup temp conversion script

7304229

added from_existing_neurodata_object for hdf5

4cc2a06

added get_existing_dataset_io_configurations

c33dfbf

CodyCBakerPhD reviewed Aug 13, 2024

src/neuroconv/tools/nwb_helpers/_configuration_models/_hdf5_dataset_io.py

CodyCBakerPhD reviewed Aug 13, 2024

src/neuroconv/tools/nwb_helpers/_configuration_models/_hdf5_dataset_io.py

CodyCBakerPhD reviewed Aug 13, 2024

temp_test.py

pauladkisson added 5 commits August 13, 2024 11:49

added support for chunk_shape=None

80c1fba

added from_existing_nwbfile to HDF5BackendConfiguration

7ee6fc6

added get_existing_backend_configuration

dacdeea

added repack_nwbfile

dae04bf

fixed bug with export options and hdmf.container.Container.set_data_io

4ac6e33

pauladkisson mentioned this pull request Aug 14, 2024

[Feature]: Add support for overriding backend configuration in HDF5 datasets hdmf-dev/hdmf#1170

Closed

refactored from_ methods

ce267fb

pauladkisson added 2 commits August 14, 2024 16:54

template and changes optional

49f4262

added image series test

d93a5c5

bendichter reviewed Aug 15, 2024

src/neuroconv/tools/nwb_helpers/_backend_configuration.py

CodyCBakerPhD reviewed Aug 15, 2024

...est_backend_and_dataset_configuration/test_helpers/test_get_default_backend_configuration.py

Merge branch 'main' into repack

ab8b22f

bendichter reviewed Aug 15, 2024

...est_backend_and_dataset_configuration/test_helpers/test_get_default_backend_configuration.py

CodyCBakerPhD reviewed Aug 15, 2024

src/neuroconv/tools/nwb_helpers/_metadata_and_file_helpers.py

CodyCBakerPhD reviewed Aug 15, 2024

src/neuroconv/tools/nwb_helpers/_metadata_and_file_helpers.py

CodyCBakerPhD reviewed Aug 15, 2024

...est_backend_and_dataset_configuration/test_helpers/test_get_default_backend_configuration.py

pauladkisson added 4 commits August 16, 2024 08:54

Merge branch 'main' into repack

934bb3a

added initial test

1ad69ca

updated signature to use file_path

04fb89c

added test for trials table (fails)

6dab477

bendichter reviewed Aug 16, 2024

src/neuroconv/tools/nwb_helpers/_metadata_and_file_helpers.py

pauladkisson added 2 commits August 16, 2024 10:14

moved backend_configuration_changes to top of the fn

e6d31a6

consolidated configure_and_export_nwbfile into configure_and_write_nwbfile

7252449

pauladkisson added 2 commits April 15, 2025 12:35

added existing configuration section to the docs

0c9aa55

added existing configuration section to the docs

cfbf523

h-mayorquin reviewed Apr 15, 2025

pauladkisson added 4 commits April 15, 2025 13:50

switched to math.prod

8069657

added test for pixel_mask for configure_backend

cba1738

updated configure_backend to work with compound dtypes

b6bf29f

added pixel_mask PlaneSegmentation to repack_nwbfile tests

c4d9c8d

h-mayorquin mentioned this pull request Apr 16, 2025

[Feature]: DatasetIOConfiguration.from_neurodata_object's buffer_gb should be settable #1299

Open

2 tasks

pauladkisson mentioned this pull request Apr 17, 2025

[Bug]: Error when pixel_mask PlaneSegmentation is exported hdmf-dev/hdmf-zarr#272

Closed

pauladkisson added 10 commits April 16, 2025 17:56

added tests for pixel_mask PlaneSegmentation (compound dtype) and some hacky fixes

9f7f63e

Merge branch 'main' into repack

299c5be

Merge branch 'main' into repack

da63806

added tests for edge cases

0a9b563

fully automatic backend for get_existing_dataset_io_configuration

0dfaf08

Merge branch 'main' into repack

fa7bcf9

de-hacked nested dataset case

360b7dd

Merge branch 'main' into repack

17e4893

py:method

34f9650

added section for repack

1f1c629

pauladkisson added 2 commits May 14, 2025 17:42

Merge branch 'main' into repack

ce16be7

switched from Union to |

61bce66

pauladkisson mentioned this pull request May 15, 2025

Writing Compound Dtype Dataset hdmf-dev/hdmf-zarr#276

Merged

4 tasks

pauladkisson added 2 commits July 22, 2025 17:44

Merge branch 'main' into repack

6e19842

updated temp_test

943c674

pauladkisson mentioned this pull request Jul 24, 2025

[Bug]: compound Dtypes do not support custom chunking and compression with Zarr. hdmf-dev/hdmf#1296

Open

pauladkisson added 3 commits July 24, 2025 08:47

updated zarr tests to use pixel_mask

4021ad9

updated test_repack_nwbfile_with_changes to use pixel_mask

d664c66

updated hdmf2zarr and zarr_to_hdmf tests to use pixel_mask

18108cd
