Zarr Python 3 tracking issue #9515

Open
2 of 4 tasks
jhamman opened this issue Sep 18, 2024 · 5 comments
Labels
topic-zarr Related to zarr storage library upstream issue

Comments

@jhamman
Member

jhamman commented Sep 18, 2024

What is your issue?

Zarr-Python 3.0 is getting close to a full release. This issue tracks the integration of the 3.0 release with Xarray.

Here's a running list of issues we're solving upstream related to integration with Xarray:

Special shout out to @TomAugspurger, who has been front running a lot of this 🙌.

@jhamman jhamman added the needs triage Issue that has not been reviewed by xarray team member label Sep 18, 2024
@TomNicholas TomNicholas added upstream issue topic-zarr Related to zarr storage library and removed needs triage Issue that has not been reviewed by xarray team member labels Sep 18, 2024
@TomAugspurger
Contributor

TomAugspurger commented Sep 18, 2024

Testing this out

There are multiple active branches right now, but you can get a usable xarray + zarr-python 3.x with these two branches:

You can install these with:

pip install git+https://github.com/TomAugspurger/zarr-python@xarray-compat git+https://github.com/TomAugspurger/xarray/@fix/zarr-v3
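After installing, it can help to confirm which versions actually ended up in the environment before testing. The following is only a minimal sketch using the standard library; the helper names `major_version` and `installed_major` are made up for illustration and are not part of zarr-python or xarray.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '3.0.0b1.dev12'."""
    return int(version.split(".", 1)[0])


def installed_major(package: str) -> Optional[int]:
    """Major version of an installed distribution, or None if it is not installed."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Hypothetical check: report what is installed under these distribution names.
    for name in ("zarr", "xarray"):
        print(name, installed_major(name))
```

With the branches above installed, the expectation is that `installed_major("zarr")` reports 3, though the exact version string of a branch install depends on how the package derives its version.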

Work Items

We require some changes on both zarr-python and xarray. I'm pushing the zarr ones to TomAugspurger/zarr-python@xarray-compat and the xarray ones to TomAugspurger/xarray@fix/zarr-v3.

zarr-python

xarray

Most of these are in my PR at #9552

Fixed issues

Things to investigate:

  • separate store / chunk_store
  • writing a subset of regions

@dcherian
Contributor

@TomAugspurger are you able to open a WIP PR with the in-progress work? It'd be nice to see what's needed.

@TomAugspurger
Contributor

Sure, #9552 has that.

@TomAugspurger
Contributor

Question for the group: does anyone object to xarray continuing to write Zarr V2 datasets by default? I hesitate to have xarray's default be different from zarr-python's, but that would relieve some pressure to address #5475 quickly, since V2 datasets should be round-trippable.
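For anyone who wants to check which format a given store was written in, here is a rough sketch. It relies only on the metadata file names from the two specs (V2 uses `.zgroup`, `.zarray`, `.zattrs`, and optionally a consolidated `.zmetadata`; V3 uses `zarr.json`). The function `detect_zarr_format` is a hypothetical helper, not an xarray or zarr-python API, and it only looks at key names rather than parsing any metadata.

```python
from typing import Iterable, Optional

# Metadata file names defined by the Zarr V2 and V3 specifications.
V2_METADATA_KEYS = {".zgroup", ".zarray", ".zattrs", ".zmetadata"}
V3_METADATA_KEY = "zarr.json"


def detect_zarr_format(keys: Iterable[str]) -> Optional[int]:
    """Guess the Zarr format (2 or 3) from the keys in a store.

    Returns None when no recognizable metadata key is present.
    """
    # Compare only the final path component, so nested arrays count too.
    names = {key.rsplit("/", 1)[-1] for key in keys}
    if V3_METADATA_KEY in names:
        return 3
    if names & V2_METADATA_KEYS:
        return 2
    return None
```

The keys could come from listing a local directory or an fsspec filesystem; how the listing is obtained is left out here on purpose.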

@TomAugspurger
Contributor

I think that support for reading Zarr V2 datasets with zarr-python v3 is close to being ready. I updated #9515 (comment) with some instructions on how to install two branches if anyone is able to test that out:

In [4]: xr.open_dataset("abfs://daymet-zarr/annual/hi.zarr", engine="zarr", storage_options={"account_name": "daymeteuwest"})
Out[4]:
<xarray.Dataset> Size: 137MB
Dimensions:                  (y: 584, x: 284, time: 41, nv: 2)
Coordinates:
    lat                      (y, x) float32 663kB ...
    lon                      (y, x) float32 663kB ...
  * time                     (time) datetime64[ns] 328B 1980-07-01T12:00:00 ....
  * x                        (x) float32 1kB -5.802e+06 ... -5.519e+06
  * y                        (y) float32 2kB -3.9e+04 -4e+04 ... -6.22e+05
...
    start_year:        1980

In [5]: xr.open_dataset("s3://cmip6-pds/CMIP6/ScenarioMIP/AS-RCEC/TaiESM1/ssp126/r1i1p1f1/Amon/clt/gn/v20201124", engine="zarr", storage_options={"anon": True})
Out[5]:
<xarray.Dataset> Size: 228MB
Dimensions:    (time: 1032, lat: 192, lon: 288, bnds: 2)
Coordinates:
  * lat        (lat) float64 2kB -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
...
    variant_label:             r1i1p1f1
