[v3] String support for v3 array #2268

Open
jhamman opened this issue Sep 29, 2024 · 0 comments
Labels
bug Potential issues with the zarr-python library
jhamman commented Sep 29, 2024

Zarr version

3.0.0.alpha

Numcodecs version

N/A

Python Version

N/A

Operating System

N/A

Installation

N/A

Description

Zarr Python 2.x supports writing string-type data to Zarr arrays. Unfortunately, strings were left out of the core dtypes in the v3 spec, so we have not implemented anything to replace this functionality in 3.0. There are discussions on how this could be done in the spec.

And some related work here:

Steps to reproduce

Writing strings to zarr works using zarr_format=2:

In [1]: import zarr

In [2]: store = {}

In [3]: data = ["this", "is", "a", "1d", "string"]

In [4]: arr = zarr.array(shape=(5, ), store=store, zarr_format=2, data=data)

In [5]: store['.zarray'].to_bytes()
Out[5]: b'{\n  "shape": [\n    5\n  ],\n  "fill_value": "",\n  "zarr_format": 2,\n  "order": "C",\n  "filters": null,\n  "dimension_separator": ".",\n  "compressor": null,\n  "chunks": [\n    5\n  ],\n  "dtype": "<U6"\n}'

but not with zarr_format=3:

In [6]: arr = zarr.array(shape=(5, ), store=store, zarr_format=3, data=data)

...
ValueError: Invalid V3 data_type: <U6
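
For context on the `<U6` in both the v2 metadata and the v3 error: it looks like the dtype NumPy infers for this list, i.e. a fixed-width unicode dtype sized to the longest element ("string", 6 characters). A minimal check, assuming the dtype is derived by NumPy in the usual way (that part is my assumption, not something confirmed by the traceback):

```python
import numpy as np

data = ["this", "is", "a", "1d", "string"]

# NumPy infers a fixed-width unicode dtype from a list of Python strings.
arr = np.array(data)

# Kind "U" means unicode; each character occupies 4 bytes (UCS-4).
kind = arr.dtype.kind
width_in_chars = arr.dtype.itemsize // 4

# The width matches the longest string in the input.
max_len = max(len(s) for s in data)

print(kind, width_in_chars, max_len)
```

So the v3 failure is about the unicode dtype itself having no v3 data type to map to, not about this particular data.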

xref: pydata/xarray#9515
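
To make the problem concrete, here is a rough sketch of one way variable-length strings could be packed into a single byte buffer: a count header followed by length-prefixed UTF-8 items. This is only an illustration of the general idea; `encode_vlen_utf8` and `decode_vlen_utf8` are hypothetical names, not zarr-python API, and the layout is an assumption rather than anything the v3 spec defines.

```python
import struct


def encode_vlen_utf8(strings):
    """Pack strings as: uint32 count, then (uint32 length, UTF-8 bytes) per item."""
    parts = [struct.pack("<I", len(strings))]
    for s in strings:
        raw = s.encode("utf-8")
        parts.append(struct.pack("<I", len(raw)))
        parts.append(raw)
    return b"".join(parts)


def decode_vlen_utf8(buf):
    """Inverse of encode_vlen_utf8: read the count, then each length-prefixed item."""
    (count,) = struct.unpack_from("<I", buf, 0)
    offset = 4
    out = []
    for _ in range(count):
        (n,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        out.append(buf[offset:offset + n].decode("utf-8"))
        offset += n
    return out


data = ["this", "is", "a", "1d", "string"]
buf = encode_vlen_utf8(data)
roundtrip = decode_vlen_utf8(buf)
```

Unlike `<U6`, this does not pad every element to the longest string, but it also means a chunk no longer has a fixed item size, which is part of why it needs explicit support rather than a plain dtype mapping.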

@jhamman jhamman added the bug Potential issues with the zarr-python library label Sep 29, 2024
@jhamman jhamman added this to the 3.0.0 milestone Sep 29, 2024
@rabernat rabernat self-assigned this Sep 29, 2024