Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Hello,
I have a small fix for the rotation instructions that the project has a bit of inline asm for. An additional mask causes both gcc as far back as 4.9.0 and clang as far back as 11.0 to successfully infer a rotation and emit a rotation instruction.
Here is a demonstration of the changelist: https://godbolt.org/z/PbMz4xcbj
I left the ASM in and merely updated the comment, though it might be worth considering deleting the ASM, as that's harder for the compiler to optimize around and with this mask the rotation instructions infer well even on somewhat older compiler versions.
The reason for the mask's necessity I believe is essentially a result of the usual arithmetic conversions: rotating a uint8 by 16 loops around twice but the rotl function before the changelist doesn't account for them. (uint8)i << (uint8)k first casts i and k to ints which means that undefined behavior does not apply for the uint8 and uint16 cases. So, the uint8 overload produced jibberish for rotations between 8 and 31 inclusively. But defined jibberish. However, uint32 and uint64 were able to emit rot instructions without the mask because >100% wraparound rotations invoked UB in the code. Fun and wacky nasal demons.
For your convenience, I also have a copy of this PR for the C version of this library: imneme/pcg-c#34
Thanks,
Richard