v0.8.1 (Jan 9, 2020)

@x448 x448 released this 10 Jan 00:00
· 106 commits to master since this release

Changes include:

  • Add PrecisionUnknown as a return value of PrecisionFromfloat32. The number of possible return values increases from 4 to 5.
// Precision indicates whether the conversion to Float16 is
// exact, subnormal without dropped bits, inexact, underflow, or overflow.
type Precision int

const (

	// PrecisionExact is for non-subnormals that don't drop bits during conversion.
	// All of these can round-trip float32->float16->float32.
	PrecisionExact Precision = iota

	// PrecisionUnknown is for subnormals that don't drop bits during conversion.
	// Only 2046 of these can round-trip float32->float16->float32; the rest cannot,
	// so precision is unknown without more effort.
	PrecisionUnknown

	// PrecisionInexact is for conversions that drop significand bits.
	// Some of these are subnormals.
	// Cannot round-trip float32->float16->float32.
	PrecisionInexact

	// PrecisionUnderflow is for Underflows.
	// Cannot round-trip float32->float16->float32.
	PrecisionUnderflow

	// PrecisionOverflow is for Overflows.
	// Cannot round-trip float32->float16->float32.
	PrecisionOverflow
)

// PrecisionFromfloat32 returns Precision without performing the conversion.
// Conversions from both Infinity and NaN values will always report PrecisionExact,
// even if the NaN payload is lost or the NaN quiet-bit is changed. This function is
// kept simple so that it can be inlined and run in under 0.5 ns/op, serving as a fast filter.
func PrecisionFromfloat32(f32 float32) Precision