Nan values when saving parq files with virtualize.to_kerchunk() #339
Comments
Hey there @QuentinMaz! Thanks for trying out VirtualiZarr and opening up a clear MRE. Definitely seems like an issue. On some initial digging it seems like:
I'll try to dig into this further. In the meantime, if you're open to working a bit on the bleeding edge, you could try writing the references to Icechunk. It might take a bit of environment-fu, since Kerchunk doesn't yet support Zarr V3. To keep a single environment, you could use the new Zarr V3-compliant HDF5 reader, then write to Icechunk:
from virtualizarr import open_virtual_dataset
from virtualizarr.readers.hdf import HDFVirtualBackend
vds = open_virtual_dataset('file.nc', backend=HDFVirtualBackend)
Thanks for the comments and the answer, @norlandrhagen! I'm fairly sure I'm not skilled enough to help with your investigation, but I will keep an eye on the issue and (try to) follow your progress :) Good luck!
Hi,
I have used virtualizarr to concatenate several .nc files into one parq file. I noticed that when I then open the saved dataset, the first value of its index is replaced with nan. I thus suspect that virtualize.to_kerchunk() might have a bug. Here is how to replicate the issue:
I am a beginner and therefore have no idea of the cause...
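One way to pin down the symptom independently of any library is to compare the index values before and after the round trip. Below is a minimal sketch, assuming the index values have already been extracted into plain Python lists (for example with something like `.values.tolist()`). The helper name `find_new_nans` is hypothetical and is not part of VirtualiZarr or Kerchunk.

```python
import math


def _is_nan(value):
    # Only floats can be NaN; ints, strings, etc. are treated as valid values.
    return isinstance(value, float) and math.isnan(value)


def find_new_nans(original, reloaded):
    """Return positions where a non-NaN value in `original` became NaN in `reloaded`.

    Hypothetical helper for illustration: both arguments are plain sequences
    holding the index values before saving and after re-opening the dataset.
    """
    if len(original) != len(reloaded):
        raise ValueError("index lengths differ")
    return [
        i
        for i, (before, after) in enumerate(zip(original, reloaded))
        if not _is_nan(before) and _is_nan(after)
    ]


# Made-up values mimicking the report: only the first index entry is lost.
index_before = [0.0, 1.0, 2.0]
index_after = [float("nan"), 1.0, 2.0]
corrupted_positions = find_new_nans(index_before, index_after)
```

With the made-up values above, `corrupted_positions` contains only position 0, which matches the behaviour described in this report.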