Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Rotation Instruction Inference #87

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rnvannatta
Copy link

@rnvannatta rnvannatta commented Oct 23, 2023

Hello,

I have a small fix for the rotation instructions that the project has a bit of inline asm for. An additional mask causes both gcc as far back as 4.9.0 and clang as far back as 11.0 to successfully infer a rotation and emit a rotation instruction.

Here is a demonstration of the changelist: https://godbolt.org/z/PbMz4xcbj

I left the ASM in and merely updated the comment, though it might be worth considering deleting the ASM, as that's harder for the compiler to optimize around and with this mask the rotation instructions infer well even on somewhat older compiler versions.

The reason for the mask's necessity I believe is essentially a result of the usual arithmetic conversions: rotating a uint8 by 16 loops around twice but the rotl function before the changelist doesn't account for them. (uint8)i << (uint8)k first casts i and k to ints which means that undefined behavior does not apply for the uint8 and uint16 cases. So, the uint8 overload produced jibberish for rotations between 8 and 31 inclusively. But defined jibberish. However, uint32 and uint64 were able to emit rot instructions without the mask because >100% wraparound rotations invoked UB in the code. Fun and wacky nasal demons.

For your convenience, I also have a copy of this PR for the C version of this library: imneme/pcg-c#34

Thanks,
Richard

Fixes the rotation functions so they infer to a rotation instruction without the usage of inline asm on gcc versions 4.9.0 and newer, plus clang versions 11.0 and newer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant