@jsignell
Member

@jsignell jsignell commented Dec 12, 2025

This PR makes the handling of chunks="auto" consistent between open_zarr and open_dataset(..., engine="zarr").

The default handling of chunks still differs between the two: open_zarr defaults to chunks={} and uses a chunk manager (e.g. dask) when one is available in your environment, whereas open_dataset defaults to chunks=None (no chunking).
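A minimal sketch of the default resolution described above, not the actual xarray implementation. The names `_default`, `resolve_open_zarr_chunks`, and `chunkmanager_available` are hypothetical and used only for illustration. The assumption is that only an omitted `chunks` argument triggers the fallback, while an explicit `"auto"` is forwarded unchanged so that it can mean the same thing in both entry points.

```python
# Hypothetical sentinel marking "the caller did not pass chunks".
_default = object()


def resolve_open_zarr_chunks(chunks=_default, chunkmanager_available=True):
    """Illustrative only: map the open_zarr default to a concrete value."""
    if chunks is _default:
        # Default: use chunks={} if a chunk manager (e.g. dask) is
        # installed, otherwise fall back to chunks=None (no chunking).
        return {} if chunkmanager_available else None
    # Any explicit value, including "auto", is forwarded as given.
    return chunks
```

Under these assumptions, calling `resolve_open_zarr_chunks()` yields `{}` when a chunk manager is present, and `resolve_open_zarr_chunks("auto")` returns `"auto"` untouched.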

@github-actions github-actions bot added topic-backends topic-zarr Related to zarr storage library io labels Dec 12, 2025
@jsignell jsignell self-assigned this Dec 12, 2025
@jsignell jsignell marked this pull request as ready for review December 12, 2025 19:10
@jsignell jsignell requested review from dcherian and slevang December 17, 2025 18:57
@jsignell jsignell requested a review from keewis December 22, 2025 19:24
from_array_kwargs = {}

- if chunks == "auto":
+ if chunks is _default:
Contributor

Let's issue a DeprecationWarning saying the default will switch to chunks=None to match open_dataset. Users who want the current behaviour with dask et al. should pass in chunks={}.
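For illustration, the suggestion above could look roughly like the following sketch. It is not code from this PR: `_default` and `resolve_chunks_with_warning` are hypothetical names, and the warning text is an assumed wording.

```python
import warnings

# Hypothetical sentinel marking "the caller did not pass chunks".
_default = object()


def resolve_chunks_with_warning(chunks=_default):
    """Illustrative only: warn when the caller relies on the default."""
    if chunks is _default:
        warnings.warn(
            "The default value of chunks will change to None to match "
            "open_dataset. Pass chunks={} to keep the current behaviour.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Keep the current default until the switch actually happens.
        return {}
    # Explicit values never trigger the warning.
    return chunks
```

Using a sentinel rather than `None` as the default is what makes it possible to tell "argument omitted" apart from an explicit `chunks=None`.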

Member Author

Oh, I was actually thinking that we would not do that part. The issue this PR is trying to fix is really that chunks="auto" means different things in open_zarr and open_dataset(..., engine="zarr"). That was the part that felt deeply surprising to me and @norlandrhagen. As long as we fix that, I don't think we need to change the default value.

Development

Successfully merging this pull request may close these issues.

Inconsistent chunking between xr.open_zarr and xr.open_dataset(..., engine='zarr') with chunks="auto"

4 participants