Replies: 4 comments
-
I think this is what you want, @rly:
import zarr
https_url = "https://dandiarchive.s3.amazonaws.com/zarr/d097af6b-8fd8-4d83-b649-fc6518e95d25/"
z = zarr.open_consolidated(https_url, mode="r")
print(list(z.keys()))
Use open_consolidated instead of open.
-
Ah, that works. Thanks, @magland! If we want to use the HTTPS URL, can we only open zarr stores that have consolidated metadata?
-
Yes. Without .zmetadata, it's impossible to get the keys over a plain HTTPS URL.
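To illustrate why .zmetadata makes the keys discoverable, here is a minimal sketch that derives the child names of a group from a consolidated metadata document. It assumes the Zarr v2 consolidated layout (a top-level "metadata" mapping keyed by paths such as "group/.zgroup"). The sample document and the helper name child_names are made up for illustration and are not taken from the store discussed here.

```python
import json

# Made-up miniature of a Zarr v2 consolidated metadata document (.zmetadata).
# The real file for a store is much larger; these keys are illustrative only.
SAMPLE_ZMETADATA = json.dumps({
    "zarr_consolidated_format": 1,
    "metadata": {
        ".zgroup": {"zarr_format": 2},
        ".zattrs": {},
        "acquisition/.zgroup": {"zarr_format": 2},
        "acquisition/ts/.zarray": {"zarr_format": 2, "shape": [10]},
        "units/.zgroup": {"zarr_format": 2},
    },
})


def child_names(zmetadata_text, group_path=""):
    """Return the sorted names of the direct children of group_path.

    Hypothetical helper: it only reads the "metadata" mapping, so no
    directory listing of the store is needed.
    """
    metadata = json.loads(zmetadata_text)["metadata"]
    prefix = group_path.strip("/")
    prefix = prefix + "/" if prefix else ""
    names = set()
    for key in metadata:
        if not key.startswith(prefix):
            continue
        parts = key[len(prefix):].split("/")
        # Keys with a single part (".zgroup", ".zattrs") describe the group
        # itself; anything longer starts with the name of a direct child.
        if len(parts) >= 2:
            names.add(parts[0])
    return sorted(names)


print(child_names(SAMPLE_ZMETADATA))
print(child_names(SAMPLE_ZMETADATA, "acquisition"))
```

Since a single GET of .zmetadata returns every metadata path, the hierarchy can be reconstructed even though a plain HTTPS endpoint offers no way to list a directory.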
-
I wonder if there is a way to configure AWS S3 to provide a simple HTTP/HTTPS index, as regular websites do. I believe (expect?) Zarr libraries would then be able to open such stores, right?
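For context on what an HTTP-level index from S3 looks like: the S3 REST API exposes a ListObjectsV2 request (a GET on the bucket with list-type=2) that returns XML rather than an HTML index. The sketch below only builds such a request URL and parses a made-up sample response. The bucket name example-bucket, the prefix, and both helper names are hypothetical. Whether a given bucket permits anonymous listing, and whether any Zarr library would consume this response, are questions this sketch does not settle.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(bucket_url, prefix):
    """Build a ListObjectsV2 URL for one 'directory' level under prefix."""
    query = urlencode({"list-type": "2", "prefix": prefix, "delimiter": "/"})
    return f"{bucket_url.rstrip('/')}/?{query}"


def parse_listing(xml_text):
    """Return (object keys, sub-prefixes) from a ListObjectsV2 XML body."""
    root = ET.fromstring(xml_text)
    keys = [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]
    prefixes = [el.text for el in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    return keys, prefixes


# Made-up response body, trimmed to the elements the parser reads.
SAMPLE_XML = """
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Contents><Key>zarr/example/.zmetadata</Key></Contents>
  <Contents><Key>zarr/example/.zgroup</Key></Contents>
  <CommonPrefixes><Prefix>zarr/example/acquisition/</Prefix></CommonPrefixes>
</ListBucketResult>
""".strip()

print(listing_url("https://example-bucket.s3.amazonaws.com/", "zarr/example/"))
print(parse_listing(SAMPLE_XML))
```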
-
As I reported on Slack:
Dandiset 000719 contains NWB Zarr stores. The file
sub-npI3_ses-20190421_behavior+ecephys_rechunk.nwb.zarr has this asset metadata:
https://api.dandiarchive.org/api/dandisets/000719/versions/draft/assets/58de1d1c-278f-4278-953a-bf8790de4c69/
which lists the content URL:
https://dandiarchive.s3.amazonaws.com/zarr/d097af6b-8fd8-4d83-b649-fc6518e95d25/
This file has consolidated metadata.
The following code uses zarr-python 2.18.7 to open the Zarr store using that HTTPS URL but finds no keys:
But there are many keys there. You can see that in the browser by going to a URL with the same base and a known file name, such as .zmetadata. For example, https://dandiarchive.s3.amazonaws.com/zarr/d097af6b-8fd8-4d83-b649-fc6518e95d25/.zmetadata does return data.
However, if I use the S3 URL for this Zarr store, I can see all the keys:
I believe I should be able to access a Zarr store using an HTTPS URL in general. I don't know why these would be any different.
This was also reported in #1745 but it seems like that discussion diverged.
@magland wrote:
What is this option that should not be enabled? Is there a way around that?
@satra wrote:
There was no follow-up on this. Is there a special HTTP endpoint for Zarr stores?
@kabilar adds on Slack that with the HTTPS URL, the data is accessible if you know the keys. It just doesn't display the keys properly and lists 0 arrays/groups within a group:
I can use the S3 URL fine. But the asset page lists the HTTPS URL and not the S3 URL. If you cannot programmatically access a Zarr store using the HTTPS URL, then perhaps the S3 URL should be shared on the asset metadata page, or it should be documented somewhere how to transform the URL.
There may also be a performance difference between the two. Early benchmarking results suggest that reading NWB HDF5 files using fsspec with HTTPS URLs is faster than with S3 URLs, and zarr-python uses fsspec under the hood.
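On transforming the URL: a minimal sketch of one possible conversion, assuming the content URL uses the virtual-hosted-style form https://<bucket>.s3.amazonaws.com/<key>, as the URL quoted above does. The function name https_to_s3 is hypothetical, and region-specific or path-style S3 hostnames are deliberately not handled.

```python
from urllib.parse import urlparse

# Assumption: virtual-hosted-style S3 hostnames of the form
# <bucket>.s3.amazonaws.com, as in the content URL from this post.
S3_HOST_SUFFIX = ".s3.amazonaws.com"


def https_to_s3(url):
    """Convert a virtual-hosted-style S3 HTTPS URL to an s3:// URL."""
    parsed = urlparse(url)
    host = parsed.netloc
    if parsed.scheme != "https" or not host.endswith(S3_HOST_SUFFIX):
        raise ValueError(f"not a recognised S3 HTTPS URL: {url}")
    bucket = host[: -len(S3_HOST_SUFFIX)]
    if not bucket:
        raise ValueError(f"no bucket name in URL: {url}")
    # The URL path already starts with "/", so it follows the bucket directly.
    return f"s3://{bucket}{parsed.path}"


print(https_to_s3(
    "https://dandiarchive.s3.amazonaws.com/zarr/d097af6b-8fd8-4d83-b649-fc6518e95d25/"
))
```

The resulting s3:// URL would presumably still need anonymous access configured in whatever S3 client opens it.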