Shaman is a C++11 library that uses operator overloading and a novel method to evaluate the numerical accuracy of an application.
It has been designed to target high-performance simulations and, thus, we ensured that it is not only accurate but also:
- fast enough to be tested on very large simulations [1]
- compatible with all of C++ mathematical functions
- threadsafe and compatible with both OpenMP and MPI
Once Shaman is compiled and linked to your project (see the Installation section below), you just need to #include <shaman.h> and replace your floating point datatypes (float, double, long double) with their Shaman equivalents (Sfloat, Sdouble, Slong_double):
#include <iostream>
#include <shaman.h>

int main()
{
    Sdouble largeNum = 1e30;
    Sdouble smallNum = 1;
    Sdouble sum = (largeNum + smallNum) - largeNum; // should be 1

    std::cout << "result as displayed by shaman: " << sum << '\n' // notice that Shaman displays only significant digits
              << "result that would have been obtained without Shaman: " << sum.number << " == " << static_cast<double>(sum) << '\n'
              << "approximation of the numerical error: " << sum.error << '\n'
              << "approximation of the number of significant digits: " << sum.digits() << std::endl;
}
Mathematical functions are defined in the Sstd namespace; additional traits and definitions can be included from the headers in the shaman/helpers folder to help when using MPI, Eigen or Trilinos.
A test is said to be unstable if numerical error could have impacted its output (which can change the branch taken by the code and deeply impact its behaviour).
To detect unstable tests, pass the SHAMAN_UNSTABLE_BRANCH flag at compile time.
They will then be monitored and counted.
You can get the exact location of the unstable tests either by setting a breakpoint on the Shaman::unstability function (which is called whenever an unstable test is detected) or by running the code with shaman_profiler.py (you will find it in the tools/shaman_profiler folder) to get a summary of the number and position of all unstable branches (note that this script adds a significant computing time overhead).
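As an illustration of what an unstable test means, here is a minimal sketch that does not use Shaman itself. The Tracked struct, the lessThan function and the unstableTests counter are hypothetical names introduced for this example, and the detection criterion (the comparison changes once the estimated error is added back) is an assumption about the general idea, not a description of Shaman's internals.

```cpp
#include <cstddef>

// Hypothetical stand-in for an instrumented number (not Shaman's actual type):
// `number` is the computed value, `error` approximates (exact - computed).
struct Tracked {
    double number;
    double error;
    double corrected() const { return number + error; }
};

// Illustrative counter of the unstable tests seen so far.
static std::size_t unstableTests = 0;

// Returns the result of the uninstrumented comparison, so control flow is
// unchanged, and counts the test as unstable when the corrected values disagree.
bool lessThan(const Tracked& a, const Tracked& b) {
    const bool computed = a.number < b.number;
    const bool corrected = a.corrected() < b.corrected();
    if (computed != corrected) {
        ++unstableTests; // a breakpoint here would locate the offending test
    }
    return computed;
}
```

With a = {0.0, 1.0} and b = {0.5, 0.0}, the computed test a < b is true while the corrected values compare the other way, so the test is counted as unstable even though the original branch is still taken.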
Shaman ensures that implicit casts are performed as they would be by the underlying types.
However, similarly to std::complex, some mixed-precision operations that are legal with the original types might be rejected by their instrumented equivalents in the absence of an explicit cast (such as Sfloat(1.5f) + double(1.5)).
To solve the problem, one just needs to add an explicit cast.
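The std::complex analogy can be reproduced with the standard library alone. The sketch below uses only std::complex (the two function names are made up for this example) and shows the rejected expression in a comment together with two ways of making the precision explicit; the same pattern is assumed to carry over to Shaman's types.

```cpp
#include <complex>

// std::complex<float> + double does not compile: operator+ is a template on a
// single type T, and deduction fails when the operands disagree (float vs double).
// std::complex<float> bad = std::complex<float>(1.5f) + double(1.5); // error

// Option 1: narrow the scalar so that the sum is done in single precision.
std::complex<float> addInSinglePrecision(std::complex<float> z, double x) {
    return z + static_cast<float>(x);
}

// Option 2: widen the complex number so that the sum is done in double precision.
std::complex<double> addInDoublePrecision(std::complex<float> z, double x) {
    return std::complex<double>(z) + x;
}
```

Which cast to add depends on the precision you want the operation to run in, which is exactly the decision the missing implicit conversion forces you to state.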
Shaman is able to propagate nan and inf correctly but they might play havoc with the numerical error computation.
This is not a problem in general, as the output is likely to be nan, meaning that its numerical error is meaningless.
However, some computations (such as a number divided by infinity) manage to recover gracefully without resulting in nan in the output.
These computations might lead to ~nan~ being displayed, meaning that the number is finite but its numerical error is not-a-number.
You can use the SHAMAN_FLUSH_NANINF flag to eliminate this problem (note, however, that it might flush infinite numerical error to zero).
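To see how a finite number can end up with a not-a-number error, consider the following sketch. It is not Shaman's code: Tracked, multiply, divide and flushNonFinite are hypothetical names, the error formulas are a plain first-order model, and the flushing policy is only an assumption about what such a flag could do.

```cpp
#include <cmath>

// Hypothetical stand-in for an instrumented number (not Shaman's actual type).
struct Tracked {
    double number; // computed value
    double error;  // approximation of (exact - computed)
};

// First-order model of a product; the local rounding error is obtained via fma.
Tracked multiply(Tracked a, Tracked b) {
    const double p = a.number * b.number;
    const double local = std::fma(a.number, b.number, -p);
    return {p, a.number * b.error + a.error * b.number + local};
}

// First-order model of a quotient.
Tracked divide(Tracked a, Tracked b) {
    const double q = a.number / b.number;
    const double local = std::fma(-q, b.number, a.number) / b.number;
    return {q, (a.error - q * b.error) / b.number + local};
}

// Assumed flushing policy: replace a non-finite error term with zero.
Tracked flushNonFinite(Tracked x) {
    if (!std::isfinite(x.error)) {
        x.error = 0.0;
    }
    return x;
}
```

Multiplying {1e308, 0} by {10, 0} overflows to infinity, with an infinite error term. Dividing {1, 0} by that result gives the finite number 0, but the error formula evaluates 0 times infinity, which is nan. Flushing removes the nan at the price of reporting a zero error.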
See our examples folder for further illustrations of typical use cases.
To compile Shaman, open a shell and run:
cmake -DCMAKE_INSTALL_PREFIX=PREFIX .
make install
Where PREFIX is the path to your desired installation folder.
You can add the SHAMAN_ENABLE_TAGGED_ERROR flag to enable tagged error (or the SHAMAN_TAGGED_ERROR compilation flag if you use make).
To ensure that CMake loads Shaman, add find_package(shaman) to the top of your CMakeLists.txt file.
To link the library, add PUBLIC shaman::shaman to the end of your target_link_libraries line.
You might also need to set the shaman_DIR variable (with the path to Shaman's autogenerated CMake files) if Shaman's installation folder is not known to CMake.
Use the SHAMAN_UNSTABLE_BRANCH flag to enable the counting and detection of unstable branches.
The Shaman::displayUnstableBranches function can then be used to print the number of unstable tests performed by the application (and additional location information if tagged error is activated).
Don't forget to enable Fused-Multiply-Add at compilation (-mfma). Shaman will keep functioning correctly without it but some operations (*, /, sqrt) will be much slower.
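The reason FMA matters for these operations can be sketched with the standard std::fma function, which computes a*b + c with a single rounding. The two helper names below are made up for this example and the code is a generic illustration of the technique, not an excerpt from Shaman.

```cpp
#include <cmath>

// Rounding error of p = a * b: barring overflow and underflow, a*b - p is
// exactly representable and a single fma recovers it.
double productError(double a, double b) {
    const double p = a * b;
    return std::fma(a, b, -p);
}

// Remainder a - q*b of the division q = a / b, also recovered with one fma.
// Dividing this remainder by b gives an approximation of the error on q.
double divisionRemainder(double a, double b) {
    const double q = a / b;
    return std::fma(-q, b, a);
}
```

std::fma returns the correctly rounded result whether or not the hardware instruction is available. Without it, the compiler has to fall back to a software routine, which is consistent with the slowdown mentioned above.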
This is the official C++ reference implementation.
You can find an official Julia implementation here (which does not include tagged error at the moment).
There is also an unofficial Haskell implementation currently part of the HGeometry library.
Shaman overloads operations to run a first-order model of the propagation of numerical error in your code. Having both numbers and a good approximation of their numerical error, we can deduce their number of significant digits and signal unstable tests.
The numerical error produced by a single operation is deduced using an Error Free Transformation for arithmetic operators and higher precision arithmetic for arbitrary mathematical functions. Once the local numerical error has been computed, it is propagated in the rest of the computation using basic arithmetic.
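The following sketch illustrates this scheme for addition and subtraction only. It is a simplified model written for this document, not Shaman's implementation: Tracked, twoSumError and significantDigits are hypothetical names, the error-free transformation used is Knuth's TwoSum, and the digit estimate is an assumed formula based on the relative error.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical stand-in for an instrumented number (not Shaman's actual type).
struct Tracked {
    double number; // computed value
    double error;  // approximation of (exact - computed)
};

// Knuth's TwoSum: exact rounding error of s = a + b (barring overflow).
double twoSumError(double a, double b, double s) {
    const double bVirtual = s - a;
    const double aVirtual = s - bVirtual;
    return (a - aVirtual) + (b - bVirtual);
}

// The incoming errors are propagated with basic arithmetic and the local
// rounding error of the operation is added on top.
Tracked operator+(Tracked a, Tracked b) {
    const double s = a.number + b.number;
    return {s, a.error + b.error + twoSumError(a.number, b.number, s)};
}

Tracked operator-(Tracked a, Tracked b) {
    const double d = a.number - b.number;
    return {d, a.error - b.error + twoSumError(a.number, -b.number, d)};
}

// Assumed estimate of the number of significant decimal digits.
double significantDigits(Tracked x) {
    if (x.error == 0.0) return std::numeric_limits<double>::digits10;
    if (x.number == 0.0) return 0.0;
    return std::max(0.0, -std::log10(std::fabs(x.error / x.number)));
}
```

Applied to (1e30 + 1) - 1e30, this model yields the number 0 with an error of 1: the computed value is wrong, but adding the error back recovers the expected result, and the digit estimate reports that no digit of the computed value is significant.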
It is important to note that Shaman follows the same control flow as the uninstrumented code. This means that, while Shaman can signal that higher-precision code would have taken a different branch, it will follow the original branch. This can lead Shaman to underestimate the numerical stability of codes that have been designed to be resilient to numerical error in intermediate steps. Hence, when Shaman indicates that a result has few or no significant digits, you should always check whether it has detected unstable branches to confirm the result.
The inner workings of Shaman are detailed in Nestor Demeure's PhD thesis (available here and here). You can reference it with:
@phdthesis{demeure_phd,
TITLE = {{Compromise between precision and performance in high performance computing.}},
AUTHOR = {Demeure, Nestor},
URL = {https://tel.archives-ouvertes.fr/tel-03116750},
SCHOOL = {{{\'E}cole Normale sup{\'e}rieure Paris-Saclay}},
YEAR = {2021},
MONTH = Jan,
TYPE = {Theses}
}
Footnotes
[1] Numerical error is more likely to appear when a large number of operations are in play.