Implement common swizzle operations. #335
base: master
crates/core_simd/src/swizzle.rs
Outdated
/// ```
#[inline]
#[must_use = "method returns a new vector and does not mutate the original inputs"]
pub fn general_reverse<const SWAP_MASK: usize>(self) -> Self {
imho this should include something like lanewise in its name, since I expect Rust to gain a bitwise grev at some point and we'd want the bitwise simd and scalar integer operations to have matching names (probably gen_rev or generalized_reverse or grev?).
For something that effectively implements a very specific xor-striding pattern, general_reverse is a nondescript name.
well, grev is what that particular bitwise op has become named, thanks to RISC-V afaict.
grev -- general bit reverse
Just because an ISA has picked a terrible name doesn't mean we need to copy it.
Looking at the precedent of u32::count_ones, it looks like the Rust standard library takes the convention of providing a more meaningful name (count_ones) while maintaining searchability for the common name popcnt using #[doc(alias = "popcnt")], https://github.com/rust-lang/rust/blob/f1b1ed7e18f1fbe5226a96626827c625985f8285/library/core/src/num/int_macros.rs#L104. I think such an approach could be warranted here too, keeping grev as an alias. Indeed, grev is a much less established term than popcnt, so the precedent from grev is weaker.
Other names I considered:
- butterfly_shuffle. The idea here is that, in the common case of SWAP_MASK being a power of 2, it implements one stage of a butterfly network. https://en.wikipedia.org/wiki/Butterfly_network
- swap_lanes_xor. The "swap lanes" part is pretty self-explanatory. The "xor" part is confusing, however: it suggests that the data bits are being xored, whereas it's actually the lane indices that are being xored.
Currently I lean towards swap_lanes_xor. Open to other suggestions!
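A minimal sketch of the alias approach described above, written against plain arrays so that it compiles on stable Rust. The name `swap_lanes_xor` is only one of the candidates from this thread, and the array-based signature is an assumption made for illustration; it is not the signature proposed in the PR.

```rust
/// Moves the element at index `i` to index `i ^ SWAP_MASK`.
///
/// The `doc(alias)` attribute keeps the function findable in rustdoc
/// search under "grev" without making that the public name.
#[doc(alias = "grev")]
pub fn swap_lanes_xor<T: Copy, const N: usize, const SWAP_MASK: usize>(v: [T; N]) -> [T; N] {
    // Assumed precondition for this sketch: N is a power of two and
    // SWAP_MASK < N, so `i ^ SWAP_MASK` always stays in bounds.
    assert!(N.is_power_of_two() && SWAP_MASK < N);
    let mut out = v;
    let mut i = 0;
    while i < N {
        out[i ^ SWAP_MASK] = v[i];
        i += 1;
    }
    out
}
```

With a power-of-two mask this performs one butterfly stage: `swap_lanes_xor::<u32, 4, 1>` swaps adjacent elements, and `swap_lanes_xor::<u32, 4, 2>` swaps adjacent pairs.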
swizzle_to_xor_indexes?
reverse_pow2_lane_groups? Conceptually this operation performs the reversal of blocks of k lanes within n-lane groups, where k and n are both powers of two, k ≤ n ≤ LANES, and the choice of k and n is determined uniquely up to operation uniqueness by choosing where index 0 will be swizzled to (it’ll be exactly SWAP_MASK).
reverse_pow2_lane_groups?
afaict grev is actually more powerful than that, it can do any arbitrary combination of those k-n reversals for arbitrary k and n.
e.g. grev(v, 5) is equivalent to simd_swizzle!(v, [5, 4, 7, 6, 1, 0, 3, 2]) which is a combination of reversing adjacent pairs (blocks of 1) and swapping blocks of 4.
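The index pattern in that example can be checked with a small stable-Rust sketch. The helper names `xor_indices` and `apply` are hypothetical and operate on plain arrays rather than `Simd`; the point is only to show that output lane `i` reads input lane `i ^ mask`, and that mask 5 is the composition of masks 1 and 4.

```rust
/// Builds the swizzle index table for an XOR pattern: output lane `i`
/// reads input lane `i ^ mask`.
const fn xor_indices<const N: usize>(mask: usize) -> [usize; N] {
    // Assumed precondition: N is a power of two and mask < N,
    // so every computed index is in bounds.
    assert!(N.is_power_of_two() && mask < N);
    let mut idx = [0usize; N];
    let mut i = 0;
    while i < N {
        idx[i] = i ^ mask;
        i += 1;
    }
    idx
}

/// Applies an index table to an array, like a swizzle with fixed indices.
fn apply<const N: usize>(v: [u32; N], idx: [usize; N]) -> [u32; N] {
    std::array::from_fn(|i| v[idx[i]])
}

// Evaluated at compile time because it initializes a `const`.
const GREV5: [usize; 8] = xor_indices::<8>(5);
```

Applying `xor_indices::<8>(1)` followed by `xor_indices::<8>(4)` gives the same result as applying `GREV5` once, which is the "combination of reversals" described above.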
Thanks all for the suggestions. Of the proposals so far, my preference order is:
1. swizzle_to_xor_indices
2. butterfly_swizzle
3. grev
I have gone with swizzle_to_xor_indices. Let me know what you think.
crates/core_simd/src/swizzle.rs
Outdated
/// Will be rejected at compile time if `LANES * 2 != DOUBLE_LANES`.
#[inline]
#[must_use = "method returns a new vector and does not mutate the original inputs"]
pub fn concat_to<const DOUBLE_LANES: usize>(self, other: Self) -> Simd<T, DOUBLE_LANES>
concat_to? Why not just concat?
I chose concat_to because I imagine we'll want to eventually (when generic_const_exprs stabilizes) deprecate this function in favor of a function that uses generic_const_exprs. I wanted to reserve the better name concat for that new function, which would have signature:
fn concat(self, other: Self) -> Simd<T, {LANES * 2}>
The idea of the concat_to naming is that most call sites will look like x.concat_to::<8>(y) or similar, so it can be read as "concatenate x to length 8 using y".
Open to other suggestions. One option would be to call it concat now, and not worry about introducing a breaking change in the future once the generic_const_exprs approach works.
Oh hm.
crates/core_simd/src/swizzle.rs
Outdated
/// Will be rejected at compile time if `LANES * 2 != DOUBLE_LANES`.
#[inline]
#[must_use = "method returns a new vector and does not mutate the original inputs"]
pub fn concat_to<const DOUBLE_LANES: usize>(self, other: Self) -> Simd<T, DOUBLE_LANES>
where
    LaneCount<DOUBLE_LANES>: SupportedLaneCount,
{
    const fn concat_index<const DOUBLE_LANES: usize>(lanes: usize) -> [Which; DOUBLE_LANES] {
        assert!(lanes * 2 == DOUBLE_LANES);
Just checking: are you aware that when a const fn is evaluated at runtime, an assert! inside it is not evaluated at compile time, but at runtime?
it's only used at compile time, so that doesn't matter here.
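The distinction being discussed can be illustrated with a small sketch on stable Rust. The function `checked_double` is hypothetical and only mirrors the shape of the `assert!` in `concat_index`; it is not code from the PR.

```rust
/// Returns `double_lanes` after checking it is exactly twice `lanes`.
const fn checked_double(lanes: usize, double_lanes: usize) -> usize {
    assert!(lanes * 2 == double_lanes);
    double_lanes
}

// Const context: the call is evaluated by the compiler, so a failing
// assert here would be reported as a compile error.
const AT_COMPILE_TIME: usize = checked_double(4, 8);

// Ordinary call with runtime arguments: the same assert is checked when
// the program runs, and a failure is a panic rather than a compile error.
fn at_runtime(lanes: usize, double_lanes: usize) -> usize {
    checked_double(lanes, double_lanes)
}
```

So whether the check happens at compile time depends on where the `const fn` is called from, not on the `const fn` itself.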
Ah, right, I see now, it happens in the associated constant. Then that requires monomorphization which puts it at risk of depending on trait and const evaluation details. Hmm. I do want to allow us to use certain post-monomorphization errors but I currently don't fully understand the evaluation patterns that rustc will do to intuit in what cases this will/will not happen.
There are cases where, due to various inputs you can give to the compiler, "dead" code can potentially get resurrected and monomorphized anyways (starting, of course, with -Clink-dead-code), so I want to better know whether this will trigger on those cases.
The guarantee: if you don't instantiate this generic at an invalid input/output type pair, then you won't get a post-monomorphization error. So this is like a type error, except with worse ergonomics because it's raised during monomorphization rather than type checking.
If we had generic_const_exprs we'd make this a genuine type error rather than monomorphization-time error, by giving this function the type concat_to(self, other: Self) -> Simd<T, {LANES * 2}>. I've created commit reinerp@1c2972b on a separate branch, to show how this would work. Unfortunately, the test doesn't compile due to what seems like a bug in generic_const_exprs.
More broadly, I would like to be able to use concat and split operations in my own code without requiring the generic_const_exprs language feature. Hence why I took my current approach for avoiding generic_const_exprs. From my perspective the generic_const_exprs feature is much less stable than portable_simd, and indeed portable_simd is the only unstable feature I allow in my own code.
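For readers unfamiliar with the pattern, here is a sketch of a monomorphization-time check that avoids generic_const_exprs, written for plain arrays on stable Rust. The type `AssertDouble` and this array-based `concat_to` are assumptions for illustration; the PR's actual implementation goes through a swizzle index table instead.

```rust
/// Hypothetical helper: carries the length check as an associated const,
/// which is only evaluated once N and M are known concretely.
struct AssertDouble<const N: usize, const M: usize>;

impl<const N: usize, const M: usize> AssertDouble<N, M> {
    const OK: () = assert!(N * 2 == M, "output length must be twice the input length");
}

/// Concatenates two length-N arrays into one length-M array.
fn concat_to<T: Copy + Default, const N: usize, const M: usize>(a: [T; N], b: [T; N]) -> [T; M] {
    // Referencing the associated const forces the check for each
    // instantiation; a mismatched (N, M) pair fails the build when the
    // function is monomorphized instead of panicking at runtime.
    let () = AssertDouble::<N, M>::OK;
    let mut out = [T::default(); M];
    out[..N].copy_from_slice(&a);
    out[N..].copy_from_slice(&b);
    out
}
```

A call such as `concat_to::<u8, 2, 4>([1, 2], [3, 4])` compiles and yields `[1, 2, 3, 4]`, while an instantiation like `concat_to::<u8, 2, 5>` would be rejected during code generation. As noted above, the error surfaces later than a type error would, which is the ergonomic cost of this approach.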
Thanks for the very helpful comments! I have switched from slice to a split API, and have made a few minor cleanups. Tests should pass now, I believe.
Co-authored-by: Jacob Lifshay <[email protected]>
I think all comments are now addressed. Please take another look.
Hi folks, I believe I have responded to all review comments so far. My sense is there are some areas where opinions differ between people on this thread on the best path forward, namely:
My sense is that this PR is stuck in review, pending consensus on these decisions. I'm not sure what process your project uses to make decisions in such cases. Could you please advise on the process? I am very excited about the |
I've definitely been neglecting this PR. Some thoughts:
concat and slice address #277. general_reverse is useful for horizontal reductions, bitonic sorting, and many similar cases.
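To make the horizontal-reduction use case concrete, here is a sketch of a butterfly sum built from XOR swizzles, using plain arrays on stable Rust. The names `swizzle_xor` and `horizontal_sum` are hypothetical, and the mask is a runtime value here for brevity, whereas the PR takes it as a const generic parameter.

```rust
/// Output lane `i` reads input lane `i ^ mask`.
/// Assumes N is a power of two and mask < N so indices stay in bounds.
fn swizzle_xor<const N: usize>(v: [i32; N], mask: usize) -> [i32; N] {
    std::array::from_fn(|i| v[i ^ mask])
}

/// Sums all lanes in log2(N) steps. After the last step every lane
/// holds the total, so lane 0 is returned.
fn horizontal_sum<const N: usize>(mut v: [i32; N]) -> i32 {
    assert!(N.is_power_of_two());
    let mut mask = 1;
    while mask < N {
        // Pair each lane with the lane whose index differs in one bit,
        // then add the pairs lanewise.
        let partner = swizzle_xor(v, mask);
        for i in 0..N {
            v[i] += partner[i];
        }
        mask <<= 1;
    }
    v[0]
}
```

For eight lanes this uses masks 1, 2 and 4, one butterfly stage per step.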