Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: More principled errors #75

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

ericphanson
Copy link
Member

@ericphanson ericphanson commented Oct 18, 2023

closes #74

This captures my thoughts on what I think EDF.jl should do:

  • writing onset/durations is different than header files and has different requirements. Use a different function for that, and use more precision (e.g. match EDFlib's 100ms precision)
  • errors should include which header field was attempting to be written, important for debugging
  • 0 isn't special; too much roundoff of encoding parameters causes errors, regardless of what value we round to (see OndaEDF should use EDF-representable encoding parameters (physical min & max) OndaEDF.jl#90). Since the caller is charge of encoding, the best we can do is check the values they give. Here I allow absolute error up to 1e-3 but I don't know if that's too strict or too loose.
  • I don't think scientific notation (Allow scientific notation in written numbers #70) is very important. I guess it could be if callers need to have very large or very small precise encoding parameters. More important is that the error should be if the roundtripping error is too large to reliably encode/decode or not.

This should probably only be merged/released after OndaEDF reliably chooses "good" encoding parameters that don't incur too much roundoff (xref beacon-biosignals/OndaEDF.jl#90)), and this should likely be a breaking release of EDF.jl.

TODO:

  • decide tolerance value. If we want to be super strict, could even be 0, requiring our callers to give exactly representable encoding values.
  • decide whether to use relative or absolute tolerance
  • decide if file.header.seconds_per_record requires a different tolerance or not. So far I've only been thinking about the 4 encoding parameters, but that is the remaining floating point value

@ericphanson ericphanson changed the title RFC: More principled rounding RFC: More principled errors Oct 18, 2023
@codecov
Copy link

codecov bot commented Oct 18, 2023

Codecov Report

Merging #75 (f6fbea2) into main (2eef49f) will increase coverage by 0.48%.
The diff coverage is 97.82%.

@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
+ Coverage   95.59%   96.07%   +0.48%     
==========================================
  Files           4        4              
  Lines         295      306      +11     
==========================================
+ Hits          282      294      +12     
+ Misses         13       12       -1     
Files Coverage Δ
src/types.jl 89.65% <ø> (-1.78%) ⬇️
src/write.jl 96.94% <97.82%> (+1.33%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ericphanson
Copy link
Member Author

BTW, to choose the tolerance, given the 4 encoding parameters as Float64, we can actually compute the error in roundtripped sample (after encoding & decoding with rounded values) and choose a tolerance in data-space rather than a tolerance in encoding-parameter-space (which would be actually "principled"). For example, we could have a principle that the additional maximal error incurred by "rounding of encoding parameters" should be less than half of the error incurred by quantization in the first place. Given digital min/max, physical min/max, we should be able to actually compute both errors analytically. This would require evaluating all 4 together, rather than each separately.

@ericphanson
Copy link
Member Author

@ararslan would love to know your thoughts here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EDF.jl does not allow truncating header values to 0, even when it should
1 participant