floating point issues in CoTeDe #214

@bkatiemills

Some of the warnings CoTeDe is throwing appear to be due to floating point calculations underlying CoTeDe's spike check. When I run only CoTeDe_spike.py, without parallelization, seemingly random profiles produce warnings of the type:

/AutoQC/miniconda/lib/python2.7/site-packages/cotede/qctests/spike.py:63: RuntimeWarning: invalid value encountered in greater
  flag[np.nonzero(self.features['spike'] > threshold)] = flag_bad
/AutoQC/miniconda/lib/python2.7/site-packages/cotede/qctests/spike.py:64: RuntimeWarning: invalid value encountered in less_equal
  flag[np.nonzero(self.features['spike'] <= threshold)] = flag_good
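
For reference, here is a minimal sketch of the flagging pattern in those two lines, run on a small masked array. The flag values, the threshold, and the choice of which entries are masked are assumptions made up for illustration; they are not taken from CoTeDe.

```python
import numpy as np

# Hypothetical values, chosen only for this illustration.
flag_good, flag_bad = 1, 4
threshold = 0.13

# Clean stand-in for self.features['spike']; masking the first and
# last entries is an assumption.
spike = np.ma.masked_array(
    [0.0, -0.12, 0.12, 0.14, -1.35, -0.63, 0.0],
    mask=[True, False, False, False, False, False, True],
)

# Same pattern as spike.py lines 63-64: np.nonzero only returns the
# indices of unmasked entries where the comparison is True.
flag = np.zeros(spike.shape, dtype="i1")
flag[np.nonzero(spike > threshold)] = flag_bad
flag[np.nonzero(spike <= threshold)] = flag_good
```

With clean data the masked entries keep their initial flag of 0 and no warning is expected; the comparison is still evaluated on the underlying data at every position, though, which is where a NaN hiding under the mask could come into play.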

Re-running from scratch shows the same warning, but on a different profile (the sample of quota data I was running this on is here: quota_subset.txt).

Digging in a bit, the invalid values are NaNs appearing in the masked array self.features['spike']. If I dump this array without the mask for a profile that produced this error, I get:

[  6.93226434e-310  -1.20000000e-001   1.20000000e-001   1.40000000e-001
  -1.35000000e+000  -6.30000000e-001               nan]

Then, if I run the exact same thing again, the test doesn't throw a warning on the same profile and instead produces

[  6.94815498e-310  -1.20000000e-001   1.20000000e-001   1.40000000e-001
  -1.35000000e+000  -6.30000000e-001   6.94812563e-310]

The first and last entries in the array are coming out a bit differently each time, sometimes tripping over into NaN and potentially corrupting the result of this test.
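
One way to check whether these NaNs can actually reach the result is to compare their positions against the mask, and to run the comparison only on unmasked entries. The sketch below uses the values from the first dump; the mask (first and last entries masked) and the threshold are assumptions for illustration, not values taken from CoTeDe.

```python
import numpy as np

# Underlying data copied from the first dump above.
data = np.array([6.93226434e-310, -0.12, 0.12, 0.14, -1.35, -0.63, np.nan])
# Assumption: only the first and last entries are masked.
mask = np.array([True, False, False, False, False, False, True])
spike = np.ma.masked_array(data, mask=mask)

# Where does the raw data hold NaN, and is any NaN left unmasked?
nan_in_data = np.isnan(np.ma.getdata(spike))
nan_unmasked = nan_in_data & ~np.ma.getmaskarray(spike)

# Compare only the unmasked entries, so the values stored under the
# mask never enter the comparison.
threshold = 0.13  # hypothetical
valid = ~np.ma.getmaskarray(spike)
exceeds = np.zeros(spike.shape, dtype=bool)
exceeds[valid] = np.ma.getdata(spike)[valid] > threshold
```

If `nan_unmasked` is all False for the affected profiles, the NaNs sit only under the mask and the warning would be noise rather than a sign of corrupted flags. That is something to verify against real profiles, not a conclusion this sketch can establish.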

@castelao @s-good thoughts?
