Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nan fillvalue attributes written by xarray #176

Open
veenstrajelmer opened this issue Nov 20, 2023 · 4 comments
Open

nan fillvalue attributes written by xarray #176

veenstrajelmer opened this issue Nov 20, 2023 · 4 comments

Comments

@veenstrajelmer
Copy link
Collaborator

I noticed that xugrid writes nan fillvalue attributes for variables without fillvalue defined. After a quick assessment I see that this is xarray behaviour and it is also happening in their code. A nan fillvalue does not make sense to me and it causes issues in some cases, like this one. In dfm_tools there is a workaround in place that pops nan fillvalue attrs upon reading, but I believe we should avoid writing them. It might be related to xarray not decoding default fillvalues if no fillvalue attribute is present. Either way, let's discuss what is expected/desired behaviour here.

The code below reads a D-FlowFM output which has no nan fillvalue attributes. When writing it with .to_netcdf() there are nan fillvalue attributes. I have tried this also with xugrid testdata, but those datasets already contain nan fillvalue attributes upon reading:

import dfm_tools as dfmt
import xarray as xr
import numpy as np

file_nc = dfmt.data.fm_curvedbend_map(return_filepath=True)
file_out = "temp_fillvals_map.nc"
ds_org = xr.open_dataset(file_nc)
ds_org.to_netcdf(file_out)
ds_out_xr = xr.open_dataset(file_out)

print("nan fillvalue attrs in original dataset:")
ds = ds_org
count_xr = 0
for varn in ds.variables:
    if '_FillValue' in ds[varn].encoding:
        if np.isnan(ds[varn].encoding['_FillValue']):
            print(varn, ds[varn].encoding['_FillValue'])
            count_xr += 1

print()

print("nan fillvalue attrs in dataset written by xugrid/xarray:")
ds = ds_out_xr
count_xr = 0
for varn in ds.variables:
    if '_FillValue' in ds[varn].encoding:
        if np.isnan(ds[varn].encoding['_FillValue']):
            print(varn, ds[varn].encoding['_FillValue'])
            count_xr += 1
@JoerivanEngelen
Copy link
Contributor

Just checking @veenstrajelmer, does pydata/xarray#8713 fix this issue?

@veenstrajelmer
Copy link
Collaborator Author

veenstrajelmer commented Feb 7, 2024

I do not think so, this is about (lack of) precision of scale_factor attrs and is unrelated to {"_FillValue":np.nan} in the netcdf file attributes. I found the list of issues @Huite and myself collected a while ago. They are listed in this issue: Deltares/dfm_tools#490. This one also lists another nan-fillvalue issue (#125), which we closed as "wont do". I guess it makes sense to do the same with this issue.

@kmuehlbauer
Copy link

@JoerivanEngelen @veenstrajelmer Just stepping by from pydata/xarray#8713. Correct, that PR won't fix this particular issue. But I'm aiming for solving the default fillvalue issue over at xarray, too. This is moving in small steps, though.

@veenstrajelmer
Copy link
Collaborator Author

@kmuehlbauer thank you for clarifying this. We were considering to fix it in this package (prevent writing of _Fillvalue:np.nan attributes and add encoding of default fillvalues per datatype if no _FillValue is supplied, both in line with the cf-convention. Do you mean that this behaviour will in the future be taken care of by xarray? In that case we can indeed consider to temporarily put it in xugrid to anticipate future behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants