Parse 0 fill value as "" for str dtype #2798
src/zarr/core/array_spec.py
Outdated
# No zarr_format available here...
# fill_value_parsed = parse_fill_value(fill_value, dtype, zarr_format=2)
fill_value_parsed = fill_value
Not sure what to make of this. The fill value has to be parsed way up the stack from here; otherwise the dtype we need to inspect is lost. Should I just toss this out?
Sorry that all the parsing was confusing; we should definitely avoid parsing the same things twice! Could you open an issue for the values that are getting parsed multiple times?
@@ -150,9 +150,11 @@ def parse_shapelike(data: int | Iterable[int]) -> tuple[int, ...]:
     return data_tuple


-def parse_fill_value(data: Any) -> Any:
+def parse_fill_value(fill_value: Any, dtype: Any, zarr_format: ZarrFormat) -> Any:
+    if zarr_format == 2 and (dtype is str or dtype == "str") and fill_value == 0:
I think this needs at least a comment explaining why we are doing this. What was zarr-python 2's behavior in this situation?
Looks like this has been superseded by #2799
@moradology #2799 is resolving a separate issue ( |
This PR implements `fill_value` parsing for v2 metadata such that if the dtype specified is `str` and 0 is the provided `fill_value`, this value is replaced with `""`.
Closes #2792
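A minimal sketch of the rule described above, written as a standalone function and not as the actual zarr-python implementation. The signature and the condition mirror the diff in this PR; the `ZarrFormat` alias defined here and the pass-through behavior for every other input are assumptions made for illustration.

```python
from typing import Any, Literal

# Assumption: a local stand-in for zarr's ZarrFormat type alias.
ZarrFormat = Literal[2, 3]


def parse_fill_value(fill_value: Any, dtype: Any, zarr_format: ZarrFormat) -> Any:
    """Replace a fill value of 0 with "" for v2 arrays whose dtype is str."""
    # Same condition as in the diff: v2 metadata, str dtype, fill value 0.
    if zarr_format == 2 and (dtype is str or dtype == "str") and fill_value == 0:
        return ""
    # Assumption: every other combination is returned unchanged.
    return fill_value


print(repr(parse_fill_value(0, str, zarr_format=2)))
print(repr(parse_fill_value(0, "int64", zarr_format=2)))
```

Note that the check depends on the dtype as the user supplied it (`str` or `"str"`), so it only works if it runs before the dtype is normalized.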
TODO:
docs/user-guide/*.rst
changes/
As a side note, I was a bit surprised at how many places independently seem to be concerned with metadata parsing. This was slightly confusing, as values undergo multiple `parse_*` steps after having already been parsed, so that without printing the stack and dropping into a debugger it is unclear when, e.g., the dtype is the one the user provided and when it is the already-parsed dtype. In this case, because the dtype of `object` is produced by `parse_dtype` in `init_array`, the only place where it was possible to check the `dtype` specified by the user was at that point (way up the stack from calls into v2.py). I'm not sure I quite understand the organization (though I'm sure there are good reasons for it, perhaps historical and compatibility related), and I'd be very interested in a rough sketch of the motivation that I'm surely failing to see.
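To illustrate why the check has to happen before the dtype is parsed, here is a small sketch. `normalize_dtype` is a hypothetical stand-in for `parse_dtype`, assumed only to turn a `str` dtype into `object` as described above; it is not the real function.

```python
from typing import Any


def normalize_dtype(dtype: Any) -> str:
    # Hypothetical stand-in for parse_dtype: assumed to map a str dtype to "object".
    if dtype is str or dtype == "str":
        return "object"
    return str(dtype)


def is_user_str_dtype(dtype: Any) -> bool:
    # The same test the PR applies to the user-supplied dtype.
    return dtype is str or dtype == "str"


user_dtype = str
parsed_dtype = normalize_dtype(user_dtype)

# Before normalization the str dtype is still recognizable...
assert is_user_str_dtype(user_dtype)
# ...but afterwards it cannot be told apart from any other object dtype.
assert not is_user_str_dtype(parsed_dtype)
```

Under these assumptions, any code that runs after normalization sees only `"object"`, which is why the fill value is handled where the user's dtype is still available.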