Improve precision of horner_polynomial #280
Merged
Overview
Improve precision of horner_polynomial
Reason for change
We previously relied, implicitly, on LLVM's behaviour of promoting f16 operations to f32, which gave extra precision equivalent to using FMA. That promotion was never applied consistently (it varied between platforms), and now it consistently does not happen, producing incorrect results for at least half-precision pow. Use FMA explicitly to fix that.
Description of change
This MR consists of two commits that are not intended to be squashed and are best reviewed separately: a large, mostly mechanical NFC commit that simplifies how horner_polynomial is used, followed by a small commit that fixes its implementation.
Anything else we should know?
With this change, we pass the OpenCL CTS fp16-staging branch's testing for half-precision pow.
Checklist
recent version available through pip) on all modified code.