-
Notifications
You must be signed in to change notification settings - Fork 320
Fixes for shuffle_iterator #7130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
| { | ||
| const uint64_t __total_bits = static_cast<uint64_t>(::cuda::std::max(4, ::cuda::std::bit_width(__num_elements))); | ||
| const uint64_t __total_bits = | ||
| (::cuda::std::max) (uint64_t{8}, static_cast<uint64_t>(::cuda::std::bit_width(__num_elements))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| (::cuda::std::max) (uint64_t{8}, static_cast<uint64_t>(::cuda::std::bit_width(__num_elements))); | |
| ::cuda::std::max(uint64_t{8}, static_cast<uint64_t>(::cuda::std::bit_width(__num_elements))); |
Should be fine now :) you can leave it as is
libcudacxx/test/libcudacxx/cuda/iterators/shuffle_iterator/uniformity.pass.cpp
Outdated
Show resolved
Hide resolved
| // Mitchell, Rory, et al. "Bandwidth-optimal random shuffling for GPUs." ACM Transactions on Parallel Computing 9.1 | ||
| // (2022): 1-20. | ||
| uint32_t __L = __val >> __R_bits_; | ||
| uint32_t __R = __val & __R_mask_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Is this part the actual fix, that we swap the definitions of the original high and low? Otherwise the rest looks the same to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually there was a significant bug. The high and low values were being initialised swapped from what they should have been. This meant that with odd number of bits one of the bits was getting removed - as a consequence the tests would run forever because the bijection was incorrect (i.e. values collided).
I wrote it from scratch again to find the bug as I couldn't find it at first! I relabeled everything with the notation from the paper which is less confusing to me.
|
pre-commit.ci autofix |
|
One thing I would also like to try in a subsequent PR is using a splitmix64 (or similar PRNG sequence) inside the round function to generate the key sequence from a 64 bit starting key instead of carrying around 24 registers with the PRNG keys. The philox PRNG just uses a weyl sequence but according to my tests it is not random enough. |
|
Further note: number of rounds is set to 24. Anything less than 12 starts to fail these tests. So 24 is probably a good conservative value for now. |
|
/ok to test 013af58 |
🥳 CI Workflow Results🟩 Finished in 59m 28s: Pass: 100%/84 | Total: 11h 33m | Max: 34m 31s | Hits: 99%/197991See results here. |
* Fix feistel bijection * Add loads of tests * Review comments --------- Co-authored-by: Michael Schellenberger Costa <[email protected]> (cherry picked from commit 762e20e)
|
Successfully created backport PR for |
* Fix feistel bijection * Add loads of tests * Review comments --------- Co-authored-by: Michael Schellenberger Costa <[email protected]> (cherry picked from commit 762e20e)
…#7147) * Clean up hierarchy (#7023) (cherry picked from commit d3db7e0) * Make c2h vector comparisons `constexpr` (#7009) (cherry picked from commit 4215bd8) * Disable cudax with msvc in CI for now (#7139) (cherry picked from commit 7ed26fc) * libcu++: silence msvc+nvcc12.9 warning plaguing c.parallel. (#7144) (cherry picked from commit b98e950) * Fixes for shuffle_iterator (#7130) * Fix feistel bijection * Add loads of tests * Review comments --------- Co-authored-by: Michael Schellenberger Costa <[email protected]> (cherry picked from commit 762e20e) * Fix uniformity test --------- Co-authored-by: pciolkosz <[email protected]> Co-authored-by: Michał Dominiak <[email protected]> Co-authored-by: Rory Mitchell <[email protected]> Co-authored-by: Michael Schellenberger Costa <[email protected]>
Fixes the issues described in #7062 (comment) and adds many more tests.