[Feature] Support upload of Zarr-backend NWB files #1310
Comments
Some work may also be needed on the representation of NWB assets with a Zarr backend: no 'i' info button appears on the asset, and the API fails to recognize the file as a single asset. Instead, every individual item blob is its own asset. (I had initially expected this given the underlying structure of the Zarr store, but on Slack Roni indicated that each Zarr chunk is not supposed to be a separate AssetBlob, which is what we are seeing below.)
from dandi.dandiapi import DandiAPIClient
client = DandiAPIClient(api_url="https://api-staging.dandiarchive.org/api")
dandiset = client.get_dandiset(dandiset_id="204919")
Calling dandiset.get_asset_by_path(path="test_read_nwbfile/test_hdf5.nwb") works as expected, but dandiset.get_asset_by_path(path="test_read_nwbfile/test_zarr.nwb") gives
ValueError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/dandi/dandiapi.py:1155, in RemoteDandiset.get_asset_by_path(self, path)
1152 try:
1153 # Weed out any assets that happen to have the given path as a
1154 # proper prefix:
-> 1155 (asset,) = (
1156 a for a in self.get_assets_with_path_prefix(path) if a.path == path
1157 )
1158 except ValueError:
ValueError: not enough values to unpack (expected 1, got 0)
During handling of the above exception, another exception occurred:
NotFoundError Traceback (most recent call last)
Cell In[21], line 1
----> 1 dandiset.get_asset_by_path(path="test_read_nwbfile/test_zarr.nwb")
File /opt/conda/lib/python3.10/site-packages/dandi/dandiapi.py:1159, in RemoteDandiset.get_asset_by_path(self, path)
1155 (asset,) = (
1156 a for a in self.get_assets_with_path_prefix(path) if a.path == path
1157 )
1158 except ValueError:
-> 1159 raise NotFoundError(f"No asset at path {path!r}")
1160 else:
1161 return asset
NotFoundError: No asset at path 'test_read_nwbfile/test_zarr.nwb'
And if I call list(dandiset.get_assets()) I see:
[RemoteBlobAsset(client=<dandi.dandiapi.DandiAPIClient object at 0x7fc5162c4400>, identifier='fd8e3782-b0c7-4bd5-89fe-e2acc0263744', path='test_read_nwbfile/test_hdf5.nwb', size=197512, created=datetime.datetime(2023, 7, 17, 15, 31, 55, 641893, tzinfo=datetime.timezone.utc), modified=datetime.datetime(2023, 7, 17, 15, 58, 44, 778333, tzinfo=datetime.timezone.utc), blob='6a61bab5-0662-49e5-be46-0b9ee9a27297', dandiset_id='204919', version_id='0.230717.1558'),
RemoteBlobAsset(client=<dandi.dandiapi.DandiAPIClient object at 0x7fc5162c4400>, identifier='a78dfc02-9cd5-402a-83c8-5006fb18d5e8', path='test_read_nwbfile/test_zarr.nwb/acquisition/ElectricalSeries/data/0.0', size=46, created=datetime.datetime(2023, 7, 17, 15, 57, 45, 173503, tzinfo=datetime.timezone.utc), modified=datetime.datetime(2023, 7, 17, 15, 58, 44, 787050, tzinfo=datetime.timezone.utc), blob='1419744b-36f6-4c28-a850-71d381fc90e5', dandiset_id='204919', version_id='0.230717.1558'),
RemoteBlobAsset(client=<dandi.dandiapi.DandiAPIClient object at 0x7fc5162c4400>, identifier='cd9faf76-cb4e-4849-b9eb-c838958676d1', path='test_read_nwbfile/test_zarr.nwb/acquisition/ElectricalSeries/electrodes/0', size=56, created=datetime.datetime(2023, 7, 17, 15, 57, 45, 215932, tzinfo=datetime.timezone.utc), modified=datetime.datetime(2023, 7, 17, 15, 58, 44, 795464, tzinfo=datetime.timezone.utc), blob='e8131c7e-095d-4242-ab4c-1658c8c3f5c5', dandiset_id='204919', version_id='0.230717.1558'),
RemoteBlobAsset(client=<dandi.dandiapi.DandiAPIClient object at 0x7fc5162c4400>, identifier='383ece04-8db0-4207-843a-86109259a5cd', path='test_read_nwbfile/test_zarr.nwb/acquisition/ElectricalSeries/starting_time/0', size=24, created=datetime.datetime(2023, 7, 17, 15, 57, 45, 222857, tzinfo=datetime.timezone.utc), modified=datetime.datetime(2023, 7, 17, 15, 58, 44, 909428, tzinfo=datetime.timezone.utc), blob='a1f46f4a-d8ec-4183-bd8c-8ed530e963e4', dandiset_id='204919', version_id='0.230717.1558'),
RemoteBlobAsset(client=<dandi.dandiapi.DandiAPIClient object at 0x7fc5162c4400>, identifier='871186e8-ac63-4c5e-b914-8b9246f7326a', path='test_read_nwbfile/test_zarr.nwb/file_create_date/0', size=56, created=datetime.datetime(2023, 7, 17, 15, 57, 45, 253174, tzinfo=datetime.timezone.utc), modified=datetime.datetime(2023, 7, 17, 15, 58, 44, 806273, tzinfo=datetime.timezone.utc), blob='9d7115fb-3133-437d-9168-7058e8fd84b6', dandiset_id='204919', version_id='0.230717.1558'),
... and so on (the entire NWB file content listed out as separate blobs)
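To make the symptom easier to quantify, here is a minimal sketch that groups asset paths by the `.nwb` directory they fall under. The helper name `zarr_root` is hypothetical (it is not part of the dandi client), and the path list is copied by hand from the listing above rather than fetched from the API.

```python
from collections import Counter
from pathlib import PurePosixPath


def zarr_root(path, suffix=".nwb"):
    """Return the leading part of `path` up to a *directory* component ending
    in `suffix`, or None if no such component exists.

    Hypothetical helper for illustration only; not part of dandi-cli.
    """
    parts = PurePosixPath(path).parts
    # Skip the final component: a plain file named *.nwb is not a store root.
    for i, part in enumerate(parts[:-1]):
        if part.endswith(suffix):
            return "/".join(parts[: i + 1])
    return None


# Paths copied from the list(dandiset.get_assets()) output above.
asset_paths = [
    "test_read_nwbfile/test_hdf5.nwb",
    "test_read_nwbfile/test_zarr.nwb/acquisition/ElectricalSeries/data/0.0",
    "test_read_nwbfile/test_zarr.nwb/acquisition/ElectricalSeries/electrodes/0",
    "test_read_nwbfile/test_zarr.nwb/acquisition/ElectricalSeries/starting_time/0",
    "test_read_nwbfile/test_zarr.nwb/file_create_date/0",
]

# Count how many separate blob assets each Zarr-backed .nwb store was split into.
exploded_counts = Counter(
    root for path in asset_paths if (root := zarr_root(path)) is not None
)
print(exploded_counts)
```

With a live client, the same grouping could presumably be applied to `a.path for a in dandiset.get_assets()`.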
The context for the asset ID part is that I want to be able to stream the content using PyNWB. I can easily do this given the S3 asset of the HDF5 file, so I had thought it would be just as easy if I had the asset ID of the Zarr folder (the 'test_zarr.nwb' file).
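For reference, streaming the HDF5 version could look roughly like the sketch below. It assumes the `dandi`, `fsspec`, `h5py`, and `pynwb` packages are installed and that the asset is reachable over HTTP; the function name `stream_hdf5_nwb_identifier` is made up for this example, and the thread does not say how an equivalent Zarr read would work.

```python
def stream_hdf5_nwb_identifier(dandiset_id, asset_path, api_url):
    """Open an HDF5-backed NWB asset over HTTP and return its identifier.

    A sketch under stated assumptions, not a tested recipe. Third-party
    imports are kept local so that defining the function has no dependencies.
    """
    from dandi.dandiapi import DandiAPIClient
    import fsspec
    import h5py
    import pynwb

    # Resolve the asset to a direct (redirect-followed) content URL.
    with DandiAPIClient(api_url=api_url) as client:
        dandiset = client.get_dandiset(dandiset_id, "draft")
        asset = dandiset.get_asset_by_path(asset_path)
        url = asset.get_content_url(follow_redirects=1, strip_query=True)

    # Read lazily over HTTP instead of downloading the whole file first.
    fs = fsspec.filesystem("http")
    with fs.open(url, "rb") as remote_file:
        with h5py.File(remote_file, "r") as h5_file:
            with pynwb.NWBHDF5IO(file=h5_file, load_namespaces=True) as io:
                return io.read().identifier
```

A call might look like `stream_hdf5_nwb_identifier("204919", "test_read_nwbfile/test_hdf5.nwb", "https://api-staging.dandiarchive.org/api")`, using the dandiset and path from this thread.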
@CodyCBakerPhD - i'm pretty positive what's happening here is the non-recognition of zarr on the CLI side and hence it's simply using the non-zarr route, which then the server interprets as individual blobs. so a fix on the CLI side that treats it as zarr would fix it. can you simply try adding the
Well, that is interesting... Making a copy of the file with the name however, nothing new appears on the dandiset view: https://gui-staging.dandiarchive.org/dandiset/204919/0.230717.1558/files?location=test_read_nwbfile%2F or the API requests. I also confirmed the asset made it to the bucket by attempting re-upload, to which it responds by saying the file already exists and so does not re-upload it.
@CodyCBakerPhD - you have stumped me. perhaps @AlmightyYakob has an answer to why that asset doesn't show up.
The file is present. The link you provided points to a previously published version, and so won't show any files uploaded to the draft version. You can see the file here: https://gui-staging.dandiarchive.org/dandiset/204919/draft/files?location=test_read_nwbfile
@AlmightyYakob Aha, yes that was it! Thank you for the sanity check. Would this workflow perhaps 'simply work' if I just naively add ".nwb" to the list of accepted Zarr entities? I'll try that out locally and see.
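As a rough illustration of what "adding .nwb to the accepted Zarr entities" could mean, the sketch below checks both the suffix and whether the path is a directory containing a Zarr v2 root marker, so that an HDF5 file named `.nwb` is not mistaken for a Zarr store. Everything here is hypothetical: `looks_like_zarr` and `ASSUMED_ZARR_SUFFIXES` are illustrative names, the default suffix set is an assumption, and this is not how dandi-cli is known to implement detection.

```python
import tempfile
from pathlib import Path

# Assumed default set of Zarr directory suffixes (illustrative only).
ASSUMED_ZARR_SUFFIXES = frozenset({".zarr", ".ngff"})
# Zarr v2 stores keep one of these metadata files at the root of a group/array.
ZARR_V2_ROOT_MARKERS = (".zgroup", ".zarray")


def looks_like_zarr(path, suffixes=ASSUMED_ZARR_SUFFIXES):
    """Return True if `path` is a directory with an accepted suffix that also
    contains a Zarr v2 root metadata file. Hypothetical helper."""
    p = Path(path)
    if not p.is_dir() or p.suffix.lower() not in suffixes:
        return False
    return any((p / marker).is_file() for marker in ZARR_V2_ROOT_MARKERS)


with tempfile.TemporaryDirectory() as tmp:
    # A directory named like the Zarr-backed file in this thread.
    zarr_nwb = Path(tmp) / "test_zarr.nwb"
    zarr_nwb.mkdir()
    (zarr_nwb / ".zgroup").write_text('{"zarr_format": 2}')
    # A regular file standing in for the HDF5-backed NWB file.
    hdf5_nwb = Path(tmp) / "test_hdf5.nwb"
    hdf5_nwb.write_bytes(b"")

    extended_suffixes = ASSUMED_ZARR_SUFFIXES | {".nwb"}
    default_result = looks_like_zarr(zarr_nwb)
    extended_result = looks_like_zarr(zarr_nwb, extended_suffixes)
    hdf5_result = looks_like_zarr(hdf5_nwb, extended_suffixes)
```

Checking for the directory and marker file, rather than the suffix alone, is one way to keep HDF5-backed `.nwb` files on the regular blob upload route.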
Possibly related to #1307, but specific to NWB format files using the Zarr-backend
I'd like to be able to upload a .nwb file written using PyNWB+HDMF-Zarr to the DANDI archive, but the dandi upload command was unable to recognize the file at all, and didn't even warn that it had been found and skipped for some reason.
An example file for testing purposes may be found here, which was forced through using devel options, specifically --allow-any-path