Additional float types #3451
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,139 @@ | ||||||||||||||||||
| - Feature Name: `additional-float-types` | ||||||||||||||||||
| - Start Date: 2023-6-28 | ||||||||||||||||||
| - RFC PR: [rust-lang/rfcs#3451](https://github.com/rust-lang/rfcs/pull/3451) | ||||||||||||||||||
| - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||||||||||||||||||
|
|
||||||||||||||||||
| # Summary | ||||||||||||||||||
| [summary]: #summary | ||||||||||||||||||
|
|
||||||||||||||||||
| This RFC proposes adding the new floating point types `f16` and `f128` to the core language and standard library. It also introduces `f80`, `doubledouble`, and `bf16` in `core::arch` for interop with existing native code. | ||||||||||||||||||
|
|
||||||||||||||||||
| # Motivation | ||||||||||||||||||
| [motivation]: #motivation | ||||||||||||||||||
|
|
||||||||||||||||||
| The IEEE 754 standard defines binary floating point formats, including binary16, binary32, binary64 and binary128. binary32 and binary64 correspond to the `f32` and `f64` types in Rust, while binary16 and binary128 are used in multiple scenarios (machine learning, scientific computing, etc.) and are supported by some modern architectures (in software or hardware). | ||||||||||||||||||
|
||||||||||||||||||
|
|
||||||||||||||||||
| In the C/C++ world, there are already types representing these formats, along with legacy non-standard types specific to some platforms. Introducing them in a limited way would help improve FFI with such code. | ||||||||||||||||||
|
||||||||||||||||||
|
|
||||||||||||||||||
| # Guide-level explanation | ||||||||||||||||||
| [guide-level-explanation]: #guide-level-explanation | ||||||||||||||||||
|
|
||||||||||||||||||
| `f16` and `f128` are primitive floating point types; they can be used just like `f32` or `f64`. They always conform to the binary16 and binary128 formats defined in IEEE 754, which means the size of `f16` is always 16 bits and the size of `f128` is always 128 bits. | ||||||||||||||||||
|
||||||||||||||||||
|
|
||||||||||||||||||
| ```rust | ||||||||||||||||||
| let val1 = 1.0; // Default type is still f64 | ||||||||||||||||||
| let val2: f128 = 1.0; | ||||||||||||||||||
| let val3: f16 = 1.0; | ||||||||||||||||||
| let val4 = 1.0f128; // Suffix of f128 literal | ||||||||||||||||||
| let val5 = 1.0f16; // Suffix of f16 literal | ||||||||||||||||||
|
|
||||||||||||||||||
| println!("Size of f128 in bytes: {}", std::mem::size_of_val(&val2)); // 16 | ||||||||||||||||||
| println!("Size of f16 in bytes: {}", std::mem::size_of_val(&val3)); // 2 | ||||||||||||||||||
| ``` | ||||||||||||||||||
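To make the "always binary16" guarantee concrete, here is a minimal sketch that decodes a binary16 bit pattern by hand on today's stable Rust. It assumes the standard IEEE 754 binary16 layout (1 sign bit, 5 exponent bits with bias 15, 10 fraction bits). The function names `pow2` and `binary16_bits_to_f32` are hypothetical helpers for illustration and are not part of this RFC.

```rust
/// Hypothetical helper: exact power of two as an f32, valid for -126 <= e <= 127.
fn pow2(e: i32) -> f32 {
    f32::from_bits(((e + 127) as u32) << 23)
}

/// Hypothetical helper: decode an IEEE 754 binary16 bit pattern into an f32.
/// Every binary16 value is exactly representable in f32, so this is lossless.
fn binary16_bits_to_f32(bits: u16) -> f32 {
    let sign = if bits >> 15 == 1 { -1.0f32 } else { 1.0f32 };
    let exp = ((bits >> 10) & 0x1F) as i32; // 5 exponent bits, bias 15
    let frac = (bits & 0x3FF) as f32; // 10 fraction bits

    let magnitude = if exp == 0 {
        // Zero and subnormals: frac * 2^-10 * 2^-14
        frac * pow2(-24)
    } else if exp == 0x1F {
        // All-ones exponent: infinity or NaN
        if frac == 0.0 { f32::INFINITY } else { f32::NAN }
    } else {
        // Normal numbers: implicit leading 1
        (1.0 + frac / 1024.0) * pow2(exp - 15)
    };
    sign * magnitude
}
```

For example, `binary16_bits_to_f32(0x3C00)` yields `1.0`, and the largest finite pattern `0x7BFF` yields `65504.0`. The two-byte `u16` storage used here matches the size the RFC guarantees for `f16`.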
|
|
||||||||||||||||||
| Because not every target supports `f16` and `f128`, the compiler provides conditional guards for them: | ||||||||||||||||||
|
||||||||||||||||||
|
|
||||||||||||||||||
| ```rust | ||||||||||||||||||
| #[cfg(target_has_f128)] | ||||||||||||||||||
| fn get_f128() -> f128 { 1.0f128 } | ||||||||||||||||||
|
|
||||||||||||||||||
| #[cfg(target_has_f16)] | ||||||||||||||||||
| fn get_f16() -> f16 { 1.0f16 } | ||||||||||||||||||
| ``` | ||||||||||||||||||
|
|
||||||||||||||||||
| All operators, constants and math functions defined for `f32` and `f64` in core are also defined for `f16` and `f128`, guarded by the respective conditional guards. | ||||||||||||||||||
|
|
||||||||||||||||||
| `f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation. | ||||||||||||||||||
|
||||||||||||||||||
| `f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation. | |
| The `f80` type is defined in `core::arch::{x86, x86_64}` as 80-bit extended precision. The `doubledouble` | |
| type is defined in `core::arch::{powerpc, powerpc64}` and represents PowerPC's non-IEEE double-double | |
| format (two `f64`s used to approximate `f128`). The `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}` and represents the "brain" float, a truncated `f32` with SIMD support on some hardware. These | |
| types do not have a literal representation. | |
| When working with FFI, the `core::ffi::c_longdouble` type can be used to match whatever type | |
| `long double` represents in C. |
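As a rough illustration of what "two `f64`s used to approximate `f128`" means, the sketch below represents a value as an unevaluated sum `hi + lo` and uses Knuth's TwoSum to keep the rounding error of each addition. The `DoubleDouble` struct and its methods are hypothetical names for illustration only. They do not reproduce the exact semantics of PowerPC's format or of the type proposed in this RFC.

```rust
/// Error-free addition (Knuth's TwoSum): returns (hi, lo) where hi is the
/// rounded sum a + b and lo is the rounding error, barring overflow.
fn two_sum(a: f64, b: f64) -> (f64, f64) {
    let hi = a + b;
    let b_virtual = hi - a;
    let a_virtual = hi - b_virtual;
    let lo = (a - a_virtual) + (b - b_virtual);
    (hi, lo)
}

/// Hypothetical illustration type: a value stored as the sum hi + lo.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DoubleDouble {
    hi: f64,
    lo: f64,
}

impl DoubleDouble {
    fn from_f64(x: f64) -> Self {
        Self { hi: x, lo: 0.0 }
    }

    /// Add a plain f64, carrying the rounding error into the low word.
    fn add_f64(self, x: f64) -> Self {
        let (s, e) = two_sum(self.hi, x);
        let (hi, lo) = two_sum(s, e + self.lo);
        Self { hi, lo }
    }
}
```

With plain `f64`, `1.0 + 1e-20` rounds back to `1.0`. With the sketch above, `DoubleDouble::from_f64(1.0).add_f64(1e-20)` keeps the small term in `lo`. Unlike binary128, the exponent range stays that of `f64`, which is one reason the format is not IEEE-conformant.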
Thanks, I did not add the mention of longdouble yet. More things need to be clarified:
- Is there always only 1 long double for each `(arch, abi, os)` tuple? For example, `powerpc64le-unknown-linux-gnu` can use either double or doubledouble or IEEE binary128 as `long double` by `-mabi=(ieee|ibm)longdouble` and `-mlong-double-(64|128)`.
- Is mangling of `long double` the same regardless of its underlying semantics?
- Some targets (also `powerpc64le` for example) support `.gnu_attribute`, so that linker can differentiate objects compiled by different long double ABI. Should Rust programs using `c_longdouble` emit such attribute?
bf16 is supported on a wide range of newer architectures, such as powerpc, x86, arm, and (WIP) risc-v. imho it should not be classified as architecture-specific but instead more like f16
Yep, bf16 should be emulated when the target arch does not support it
For people that have never heard of bf16 or doubledouble (which I assume are 16 and 128 bits in size, respectively), it would be good to link to some sort of document explaining them, and how they differ from f16 and f128, respectively.
Also the RFC needs to say what their semantics are, if IEEE doesn't specify them.
nobody seems to agree on bf16 semantics:
arm has both round as normal with subnormals supported and round to odd with subnormals not supported.
x86 has round to nearest with subnormals not supported.
powerpc has round as normal with subnormals supported.
all isas have round towards zero with subnormals supported (just f32::to_bits(v) >> 16).
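To illustrate how the rounding choices above differ, here is a small sketch that converts `f32` to a bf16 bit pattern in two ways: truncation (the `f32::to_bits(v) >> 16` approach mentioned above) and round-to-nearest-even with subnormals kept. The function names are hypothetical, and the rounding routine is one common software formulation. It is not claimed to match any particular ISA's instruction bit for bit.

```rust
/// Round toward zero: keep the top 16 bits of the f32 encoding.
fn f32_to_bf16_truncate(v: f32) -> u16 {
    (v.to_bits() >> 16) as u16
}

/// Round to nearest, ties to even (assumed software formulation).
fn f32_to_bf16_nearest_even(v: f32) -> u16 {
    let bits = v.to_bits();
    if v.is_nan() {
        // Keep the sign and exponent, and force a quiet NaN so the
        // payload cannot be rounded into an infinity.
        return ((bits >> 16) as u16) | 0x0040;
    }
    // Add 0x7FFF, plus 1 if the lowest kept bit is odd, then truncate.
    let rounding_bias = 0x7FFF + ((bits >> 16) & 1);
    (bits.wrapping_add(rounding_bias) >> 16) as u16
}

/// Widening bf16 to f32 is exact: append 16 zero bits.
fn bf16_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}
```

For an input such as `f32::from_bits(0x3F80_FFFF)`, truncation gives `0x3F80` while round-to-nearest-even gives `0x3F81`, so the two conventions can disagree by one unit in the last place. An RFC would need to pick one (or defer to hardware) for results to be portable.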
| The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`. | |
| The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`. | |
| `x86_64-*` and `aarch64-*` |
I don't know for sure what targets support it, but we should aim to at least support the major 64-bit CPUs
here at first
There is also a risc target per @aaronfranke here #2629 (comment) but I'm not sure how rv64gQc maps to our riscv64gc
Not every platform supports f32 and f64 natively either. For example, RISC-V without the F or D extensions (ex: ISA string of rv64i). This should be mentioned.
Whatever emulation Rust already does to support f32 and f64 on systems without native support should similarly happen to emulate f128 on systems without native quadruple-precision support.
For riscv without hardware float support there is a defined soft-float ABI. There is not for f16/bf16. Same for x86_64. Many other architectures likely don't have a defined soft-float abi for f128 either. And as I understand it AArch64 doesn't have a soft-float abi at all as Neon support is mandatory and floats are even allowed inside the kernel unlike eg x86_64 where floats are disabled in the kernel to avoid having to save them on syscalls.
I suggest using these symbols, which all start with the `f` prefix for consistency:
- `f128`: C `_Float128`, LLVM `fp128`, GCC `__float128`
- `f16`: C `_Float16`, LLVM `half`, GCC `__fp16`
- `f16b`: C++ `std::bfloat16_t`, LLVM `bfloat`, GCC `__bf16`
- `f80e`: LLVM `x86_fp80`, GCC `__float80`
- `f64f64`: LLVM `ppc_fp128`, GCC `__ibm128`

`doubledouble`, or maybe `f64f64`

`e` means not standard

And these symbols can be used as literal suffixes as is
+1 for `f16b` over `bf16` for consistency, I liked that about @joshtriplett's original proposal.

I don't think we should introduce something like `f128x` - `doubledouble` or something like the GCC or LLVM types would be better IMO. Reason being, it's kind of ambiguous and specific to one architecture - PowerPC is even moving away from it. Better to give it an unambiguous name since it will be used relatively rarely.
i think we should use `bf16` rather than `f16b` since that is widely recognized whereas `f16b` isn't, and `f64_f64` instead of `f128x` since it really is 2 `f64` values and could be easily emulated on any other architecture (do not use `f64x2` since that's already used by `Simd<f64, 2>`).

also `f<N>x` names are more or less defined by IEEE 754 to be wider than `N` bits, so e.g. `f64x` would be approximately any type wider than `f64` but less than `f128` such as `f80`, `f16x` could be the `f24` type used by some GPUs for depth buffers. so logically `f80x` would need to be more than 80 bits and `f128x` would need to be more than 128 bits.
`f64f64` is OK, no need for the underscore, it looks like `doubledouble`; `f80e` instead of `f80x` if `f80x` is not acceptable.

Still vote for `f16b`. It's Rust-specific; we can create a relationship between `bf16` and `f16b` in Rust, and that won't be a burden.
What about `f64x2` to indicate it's two f64 glued together?
do not use f64x2 since that's already used by Simd<f64, 2>