Slicing not supported with Indexer when trying to open many MUR files. #360
Comments
I'm not sure why this operation requires indexing at all... Can you open the files individually and then concatenate them to try to narrow down where / for which dataset the error is occurring?
So I tried that approach and narrowed it down a bit. This code snippet uses the same
Also, FYI: I checked the source netCDFs and verified that one of them does have an extra variable, to make sure this isn't a dmrpp reader error.

import dask
import earthaccess
import xarray as xr
import virtualizarr as vz

# `results` is assumed to be the list of granules returned by the earthaccess search
fs = earthaccess.get_s3_filesystem(results=results)
fs.storage_options["anon"] = False  # type: ignore
open_ = dask.delayed(vz.open_virtual_dataset)  # type: ignore
vdatasets = []
# Get list of virtual datasets (or dask delayed objects)
for g in results:
    vdatasets.append(
        open_(
            filepath=g.data_links(access="direct")[0] + ".dmrpp",
            filetype="dmrpp",  # type: ignore
            indexes={},
            reader_options={"storage_options": fs.storage_options},  # type: ignore
        )
    )
# Resolve the delayed objects into actual virtual datasets
vdatasets = dask.compute(vdatasets)[0]
# Combining just these two neighbouring datasets triggers the problem
xr.combine_nested(
    vdatasets[495:497],
    concat_dim="time",
    coords="minimal",
    compat="override",
    combine_attrs="drop_conflicts",
)

print(vdatasets[495])
print(vdatasets[496])

<xarray.Dataset> Size: 5GB
Dimensions: (time: 1, lat: 17999, lon: 36000)
Coordinates:
time (time) int32 4B ManifestArray<shape=(1,), dtype=int32, ...
lat (lat) float32 72kB ManifestArray<shape=(17999,), dtype=...
lon (lon) float32 144kB ManifestArray<shape=(36000,), dtype...
Data variables:
mask (time, lat, lon) int8 648MB ManifestArray<shape=(1, 179...
sea_ice_fraction (time, lat, lon) int8 648MB ManifestArray<shape=(1, 179...
dt_1km_data (time, lat, lon) int8 648MB ManifestArray<shape=(1, 179...
analysed_sst (time, lat, lon) int16 1GB ManifestArray<shape=(1, 1799...
analysis_error (time, lat, lon) int16 1GB ManifestArray<shape=(1, 1799...
Attributes: (12/47)
Conventions: CF-1.5
title: Daily MUR SST, Final product
summary: A merged, multi-sensor L4 Foundation SST anal...
references: http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...
institution: Jet Propulsion Laboratory
history: created at nominal 4-day latency; replaced nr...
... ...

<xarray.Dataset> Size: 4GB
Dimensions: (time: 1, lat: 17999, lon: 36000)
Coordinates:
time (time) int32 4B ManifestArray<shape=(1,), dtype=int32, ...
lat (lat) float32 72kB ManifestArray<shape=(17999,), dtype=...
lon (lon) float32 144kB ManifestArray<shape=(36000,), dtype...
Data variables:
mask (time, lat, lon) int8 648MB ManifestArray<shape=(1, 179...
sea_ice_fraction (time, lat, lon) int8 648MB ManifestArray<shape=(1, 179...
analysed_sst (time, lat, lon) int16 1GB ManifestArray<shape=(1, 1799...
analysis_error (time, lat, lon) int16 1GB ManifestArray<shape=(1, 1799...
Attributes: (12/47)
Conventions: CF-1.5
title: Daily MUR SST, Final product
summary: A merged, multi-sensor L4 Foundation SST anal...
references: http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...
institution: Jet Propulsion Laboratory
history: created at nominal 4-day latency; replaced nr...
... ...
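The two printouts above differ only in the `dt_1km_data` variable. A minimal sketch of how the odd datasets could be located across the whole list, assuming the variable names of each dataset have been collected into sets; `find_variable_mismatches` is a hypothetical helper, not part of xarray or VirtualiZarr:

```python
from collections import Counter


def find_variable_mismatches(var_sets):
    """Return {index: (extra, missing)} for every entry whose variable names
    differ from the most common set of variable names in the list."""
    frozen = [frozenset(names) for names in var_sets]
    if not frozen:
        return {}
    # Treat the most frequent set of names as the reference layout.
    reference, _ = Counter(frozen).most_common(1)[0]
    mismatches = {}
    for i, names in enumerate(frozen):
        if names != reference:
            mismatches[i] = (sorted(names - reference), sorted(reference - names))
    return mismatches


# Toy stand-ins for the variable names printed above (not the real datasets).
typical = {"mask", "sea_ice_fraction", "analysed_sst", "analysis_error"}
with_extra = typical | {"dt_1km_data"}
print(find_variable_mismatches([typical, with_extra, typical]))
```

With the real data, the input could be built as `[set(ds.data_vars) for ds in vdatasets]`. If the extra variable is not needed, dropping it from the flagged datasets with xarray's `Dataset.drop_vars` before combining may sidestep the problem, though that was not tested in this thread.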
haha I knew this was an issue with model output (variables can appear and disappear) but it's news to me that it can happen with nominally-standard-archival data. FWIW I think Xarray will try to reindex
@dcherian Yup, the error output if I change the range of results to [545:555] is:
NotImplementedError: Doesn't support slicing with (array([-1, -1, -1, 0, 1, 2, 3, 4, 5, 6]), slice(None, None, None), slice(None, None, None))
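One possible reading of the indexer in that error, sketched under an assumption the thread does not confirm: that xarray concatenates a variable over only the datasets that contain it, then reindexes the result along `time`, using -1 for positions where the variable is absent. The helper name `fill_indexer` and the presence pattern below are hypothetical, chosen to match the ten datasets in `[545:555]`:

```python
def fill_indexer(has_variable):
    """Map each position along the concat dimension to its index in the
    array built from only the datasets that have the variable; -1 marks
    positions where the variable is missing."""
    indexer = []
    next_position = 0
    for present in has_variable:
        if present:
            indexer.append(next_position)
            next_position += 1
        else:
            indexer.append(-1)
    return indexer


# Assumed pattern: the first three datasets lack the variable, the next seven have it.
presence = [False] * 3 + [True] * 7
print(fill_indexer(presence))  # [-1, -1, -1, 0, 1, 2, 3, 4, 5, 6]
```

Under that assumption, the error would come from applying this integer-array indexer to a `ManifestArray`, which does not support that kind of indexing.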
Huh. I'm not sure I fully understand yet - is this an example of trying to auto-generate indexes? Or an xarray "virtual variable" thing (not a virtual variable in the virtualizarr sense)?
xref: nsidc/earthaccess#903
I'm trying to open 10 years of high-resolution MUR data using earthaccess. The upcoming example for 1 month (32 granules) works fine, but when I try to open ~1000 granules I get the following error:
I'm not sure if there is something wrong with the data (misaligned grid in one granule or something like that). @TomNicholas mentioned that this may be related to #51
The full example (earthaccess has to be installed from source):