
Unpacked values from scaled/offset dataset are all NaN. #1312

Closed
rschmunk opened this issue Mar 5, 2024 · 6 comments · Fixed by #1315
Labels
bug Something isn't working

Comments

@rschmunk
Contributor

rschmunk commented Mar 5, 2024

Versions impacted by the bug

Any build of version 5.5.4 from the past month.

What went wrong?

A user wrote that a plot of his data was coming up blank when using a just-downloaded Panoply. A quick check revealed that Panoply was seeing all NaN values even though an ncdump showed good values. Some further testing demonstrated that this was also occurring with a copy of Panoply from a month ago, but that a copy from the very end of December 2023 was plotting as expected. In all three cases, the Panoply release was using an up-to-date build of the netcdf-Java 5.x-maint snapshot.

The dataset packs the data as shorts and specifies a scale value and an offset. Panoply acquires the dataset in enhanced mode and accepts whatever values netCDF-Java says are there.
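For reference, CF-style packing applies a linear transform to each stored short when the dataset is read in enhanced mode. A minimal sketch of the unpack arithmetic (the magnitudes below are hypothetical stand-ins, not values taken from the dataset):

```java
// Sketch of CF scale/offset unpacking: unpacked = packed * scale_factor + add_offset.
// The scaleFactor and addOffset values are hypothetical, chosen only to be
// "small" like the ones described later in this thread.
public class UnpackSketch {
  static double unpack(short packed, double scaleFactor, double addOffset) {
    return packed * scaleFactor + addOffset;
  }

  public static void main(String[] args) {
    double scaleFactor = 1e-12;
    double addOffset = 1e-7;
    short packed = 20000;
    System.out.println(unpack(packed, scaleFactor, addOffset)); // about 1.2e-7
  }
}
```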

If I modify Panoply to acquire the dataset in NOT enhanced mode, then netCDF-Java reports the raw packed short values and a plot is successfully made. If I build a copy of Panoply using a copy of netcdfAll that I built on December 28 and tell it to acquire the dataset in enhanced mode, the unpacked data values are correctly reported and a plot is successfully made.

Looking at the commit history on the 5.x-maint branch, it looks like there were some modifications to data unpacking made around January 12, which seems a likely point for the problem to have been introduced. Nothing in those changes jumps out at me as the obvious cause, but I haven't dug very deep yet.

Not sure if it's at all meaningful, but the scale and offset values are reeeaally small.

A copy of the example dataset that revealed the issue can be downloaded from Dropbox. It's a CF-1.6 dataset with lon-lat data on a relatively small array covering the region around Mexico City.

Relevant stack trace

No response

Relevant log messages

No response

If you have an example file that you can share, please attach it to this issue.

If so, may we include it in our test datasets to help ensure the bug does not return once fixed?
Note: the test datasets are publicly accessible without restriction.

Yes

Code of Conduct

  • I agree to follow the UCAR/Unidata Code of Conduct
@rschmunk rschmunk added the bug Something isn't working label Mar 5, 2024
@haileyajohnson haileyajohnson self-assigned this Mar 5, 2024
@haileyajohnson

@rschmunk just to clarify, the place where Panoply is getting incorrect data is from a call to NetcdfDatasets.openDataset(filename) with enhanced=all?

@rschmunk
Contributor Author

rschmunk commented Mar 8, 2024

Actually

NetcdfFile njfile = NetcdfFiles.open(filePathStr);
NetcdfDataset njdataset = NetcdfDatasets.enhance(njfile, ENHANCEMENT, null);

where ENHANCEMENT is the default enhancement plus allowing for an "incomplete" coordinate system.

If you're wondering why I acquire the dataset in two steps, it's actually because there's also a step where I acquire an unenhanced version in order to see the original CDL description.

@rschmunk
Contributor Author

rschmunk commented Mar 9, 2024

@haileyajohnson, I tracked this down, and it has to do with the changes made to the ConvertMissing filter on January 12. I'm not sure which commit in particular was the coup de grâce, as there were about 6 of them that day.

TL;DR - the unpacked data values from the sample dataset are all much smaller than Misc.defaultMaxRelativeDiffFloat.

Long version...

As noted above, the scale and offset values in the sample dataset are really small. Tiny even. The scale factor is of order 1E-12 and the offset of order 1E-7, and after unpacking, the output data results are of order 1E-8.

The dataset also specifies that the _FillValue is the same as the missing_value. Unpacked, the fill/missing value is ~ 0 (actually, also of order 1E-12).

The problem, then, is that ConvertMissing.isMissing receives an unpacked data value and checks whether it is less than Misc.defaultMaxRelativeDiffFloat away from the unpacked fill value. Since that default difference is of order 1E-5, the method of course returns true, every value in the array is deemed invalid, and the convert method sends back NaN for the entire array.

So it seems like a smaller difference value needs to be used in the test when dealing with such tiny scale factors and offsets.
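The failure mode described above can be reproduced with a toy version of the check. This is a sketch of the comparison as described in this thread, not the actual ConvertMissing code; the 1e-5 threshold is only a stand-in for Misc.defaultMaxRelativeDiffFloat:

```java
// Toy reproduction of the described bug: a tolerance of ~1e-5 applied to data
// whose unpacked values are all of order 1e-8 classifies every value as missing.
public class MissingCheckSketch {
  static final double TOLERANCE = 1e-5; // stand-in for Misc.defaultMaxRelativeDiffFloat

  // Sketch of the comparison as described above (not the real ConvertMissing code).
  static boolean isMissing(double value, double fillValue) {
    return Math.abs(value - fillValue) < TOLERANCE;
  }

  public static void main(String[] args) {
    double fill = 1e-12;                 // unpacked fill/missing value, ~0
    double[] data = {1e-8, 2e-8, 5e-8};  // unpacked data values, order 1e-8
    for (double v : data) {
      System.out.println(v + " -> missing? " + isMissing(v, fill)); // all true
    }
  }
}
```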

@haileyajohnson

Thanks for tracking that down! We did change the fuzzy-equals comparison to use a relative diff instead of an absolute diff, and that's what broke this case; I'm planning on reverting that change. But another issue, I think, is that we need to do the missing and fill value comparisons with unpacked data, so that the fuzzy-equals implementation doesn't matter. I should have a fix up shortly. Thanks for all your help!
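At these magnitudes the absolute and relative flavors of a fuzzy-equals test behave very differently, which is why the choice of implementation matters. A toy illustration (not the actual Misc code; the 1e-5 constant is only a stand-in for the default tolerance):

```java
// Comparing absolute vs. relative fuzzy-equals on values of the magnitudes
// discussed in this thread. The 1e-5 constant is a stand-in for the default
// tolerance; neither method is the actual Misc implementation.
public class FuzzyEqualsSketch {
  static final double MAX_DIFF = 1e-5;

  // Absolute-difference comparison.
  static boolean nearlyEqualsAbs(double a, double b) {
    return Math.abs(a - b) <= MAX_DIFF;
  }

  // Relative-difference comparison, scaled by the larger magnitude.
  static boolean nearlyEqualsRel(double a, double b) {
    return Math.abs(a - b) <= MAX_DIFF * Math.max(Math.abs(a), Math.abs(b));
  }

  public static void main(String[] args) {
    double fill = 1e-12; // unpacked fill value, ~0
    double data = 1e-8;  // a typical unpacked data value
    System.out.println("abs: " + nearlyEqualsAbs(data, fill)); // true: data looks like fill
    System.out.println("rel: " + nearlyEqualsRel(data, fill)); // false: hugely different in relative terms
  }
}
```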

@rschmunk
Contributor Author

@haileyajohnson, I thought about trying to hack up a test of modifying ConvertMissing to allow for specifying a more flexible relative difference in its fuzzy-equals test. But I took a second look at how many parameters there already are in the constructor and said, "Nah."

@rschmunk
Contributor Author

I downloaded a fresh snapshot and built it into Panoply. It plots the sample data as expected, so it looks like #1315 was what was needed.
