Reference X failed to fetch target #545

dkczk · 2025-02-24T14:56:24Z

Error Description

Hi folks,

I followed kerchunk's quick start for a single file, but reading the JSON file containing the chunk info at the end raises a "ReferenceNotReachable" error:

ReferenceNotReachable: Reference "lat/0" failed to fetch target ['c_gls_LAI300_201402280000_GLOBE_PROBAV_V1.0.1.nc', 972117, 376320]

At first glance this looks like a path issue, but I already tried to open and read the dataset using only the same paths, and that worked (see steps below).

Code

Setup

I have to connect to two network drives. The first is the file storage containing the NetCDF file I want to chunk; the second should store the JSON file containing the chunk info. I connect to both of them via the SMB protocol.

import fsspec
import xarray as xr
from kerchunk.hdf import SingleHdf5ToZarr
import ujson
import json
from pathlib import Path


host_data = "data.storage.server.com/path/to/data_dir"
host_cube = "json.storage.server.com/path/to/json_dir"
usr = "MyUsername"
pwd = "MyPassword"

fs = fsspec.filesystem(
    "smb",
    host=host_data,
    username=usr,
    password=pwd
)

fs_2 = fsspec.filesystem(
    "smb",
    host=host_cube,
    username=usr,
    password=pwd
)

To confirm that the connection works, I open/read the desired file with:

file = "c_gls_LAI300_201402280000_GLOBE_PROBAV_V1.0.1.nc"
with fs.open(file, mode="rb") as f:
    data = xr.open_dataset(f)

data

I then follow the steps proposed in the guide with slight variations, e.g. defining the name of the JSON file up front, as I only want to test the procedure:

def gen_json(file_url):
    with fs.open(file_url, **so) as infile:
        h5chunks = SingleHdf5ToZarr(infile, file_url, inline_threshold=500)

        with fs_2.open(outf, 'wb') as f:
            f.write(ujson.dumps(h5chunks.translate()).encode())

so = dict(mode='rb')
outf = "CGLS_LAI_20140228.json"

gen_json(file)

After checking the location I can confirm that the JSON file is there. So I proceed with loading the JSON into a variable:

with fs_2.open(outf) as f:
    reference = json.load(f)

The last step is to open the dataset through the loaded JSON references. Following the guide, adapted to my SMB case, I ran:

data_chunk = xr.open_dataset(
    "reference://",
    engine="zarr",
    backend_kwargs={
        "consolidated": False,
        "storage_options": {
            "fo": reference,
            "remote_protocoll": "smb",
            "remote_options": {
                "host": host_data,
                "username": usr,
                "password": pwd
            }
        }
    }
)

But after executing this I get the error from above.

Debugging Attempts

I checked two things. First, I opened the file directly as in the beginning, which worked. Second, I looked through the JSON for incorrect paths but found nothing.
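For anyone repeating the path check, a minimal sketch of how the targets in the loaded references could be summarized rather than inspected by eye. It assumes the usual kerchunk layout, where chunk entries are `[url, offset, length]` lists and inlined data are plain strings. The helper name `summarize_targets` and the `.zgroup` entry in the sample are hypothetical; the `lat/0` entry is the one from the error message above.

```python
from collections import Counter


def summarize_targets(reference):
    """Count how many references point at each (protocol, target) pair."""
    # Version-1 kerchunk output nests entries under "refs"; otherwise
    # treat the whole dict as the key -> reference mapping.
    refs = reference.get("refs", reference)
    counts = Counter()
    for value in refs.values():
        # Strings are inlined data; only lists point at an external file.
        if isinstance(value, list) and value:
            url = value[0]
            # A target without "://" carries no protocol of its own.
            protocol = url.split("://", 1)[0] if "://" in url else None
            counts[(protocol, url)] += 1
    return counts


# Hypothetical sample shaped like the references in this post.
sample = {
    "version": 1,
    "refs": {
        ".zgroup": '{"zarr_format": 2}',
        "lat/0": ["c_gls_LAI300_201402280000_GLOBE_PROBAV_V1.0.1.nc", 972117, 376320],
    },
}

for (protocol, url), n in summarize_targets(sample).items():
    print(protocol, url, n)
```

A target reported with protocol `None` is a bare relative path, so whether it can be fetched presumably depends on which filesystem the reference filesystem ends up using for it.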

File In Question

The file I'm trying to open is a Copernicus Global Land Service Leaf Area Index file. It has two dimensions (lon, lat), two coordinate axes (lon, lat), six data variables, and two Pandas indexes.

System Info

OS

NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"

Python

Version=3.9.19

Relevant Miniconda Packages

Name         Version   Build                     Channel
fsspec       2024.6.1  pyhff2d567_0              conda-forge
h5netcdf     1.3.0     pyhd8ed1ab_0              conda-forge
h5py         3.8.0     mpi_mpich_py39hadaddcd_0  conda-forge
hdf4         4.2.15    h9772cbc_5                conda-forge
hdf5         1.12.2    mpi_mpich_h5d83325_0      conda-forge
kerchunk     0.2.7     pyhd8ed1ab_0              conda-forge
netcdf4      1.6.3     nompi_py39h8b3a7bc_100    conda-forge
smbprotocol  1.15.0    pyhd8ed1ab_0              conda-forge
ujson        5.10.0    py39h84cc369_0            conda-forge
xarray       2024.7.0  pyhd8ed1ab_0              conda-forge
zarr         2.18.2    pyhd8ed1ab_0              conda-forge


martindurant · 2025-02-24T15:08:58Z

It's probably worth making the reference filesystem explicitly for debugging:

fs = fsspec.filesystem("reference", fo=reference, remote_protocol=, remote_options=)

and checking out what files it thinks it can see and read. You can also check out the .fss attribute to see what filesystems it's trying to operate on.

It's also a good idea to turn on logging, something like

fsspec.utils.setup_logging(logger_name="fsspec.reference")

Was there further information in the exception/traceback?
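As a rough sketch of the check suggested above: once the reference filesystem has been built by hand, two small helpers can report which backends it holds and whether individual references can be fetched. The helper names `describe_backends` and `probe_references` are hypothetical. The code only assumes that the object exposes the `.fss` mapping mentioned above and fsspec's `cat_file` method. The commented-out construction fills the blanks with the values from the post, which is an assumption about what was intended. Note that it spells the key `remote_protocol`, whereas the `storage_options` in the post use `remote_protocoll`; building the filesystem by hand is one way to rule that difference in or out.

```python
def describe_backends(ref_fs):
    """Map each protocol key in ref_fs.fss to the class name of its filesystem."""
    return {protocol: type(backend).__name__ for protocol, backend in ref_fs.fss.items()}


def probe_references(ref_fs, keys):
    """Try to fetch each reference key and record success or the raised error."""
    report = {}
    for key in keys:
        try:
            data = ref_fs.cat_file(key)
        except Exception as exc:  # broad on purpose: report whatever is raised
            report[key] = f"{type(exc).__name__}: {exc}"
        else:
            report[key] = f"ok, {len(data)} bytes"
    return report


# Usage with the objects from the post (assumed values, not run here):
#
#   import fsspec
#   ref_fs = fsspec.filesystem(
#       "reference",
#       fo=reference,
#       remote_protocol="smb",
#       remote_options={"host": host_data, "username": usr, "password": pwd},
#   )
#   print(describe_backends(ref_fs))
#   print(probe_references(ref_fs, ["lat/0"]))
```

If `describe_backends` shows a local filesystem where an SMB one was expected, that would point at the options not reaching the reference filesystem rather than at the paths in the JSON.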
