-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Topic refactor cuda hip atomics #1768
Topic refactor cuda hip atomics #1768
Conversation
90f6ed4
to
6ed752b
Compare
// | ||
// Atomics can be used in the hierarchy level grids, blocks and threads. | ||
// Atomics are not guaranteed to be save between devices. | ||
class AtomicUniformCudaHipBuiltIn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it also works, if the CUDA and the HIP are enabled at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I know. You can not enable CUDA and HIP at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really, I thought Andrea implemented it? Or is it permitted by the CMake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I know he is compiling a library for CUDA and a library for HIP and linking both together.
Both libs will use a different compiler.
@fwyzard Could you check if this PR is working for you. I do not exactly know your workflows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I remember that we introduced the device API struct to avoid naming conflicts. Nevertheless, these changes are fully on the device side and should not create symbols on the host side which can conflict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I have been doing night shifts this week, and had little time for the rest :-/
I'll try to have a look ASAP.
6ed752b
to
d7e586b
Compare
bb5870b
to
da30e6c
Compare
92eba90
to
c400aca
Compare
- check for builtin atomics and use these if available - emulate atomics which are not provided as builtin via atomicCas - emulate atomicCas for floating point data
Unify CPU and HIP/CUDA atomic tests. For CUDA run tests only for 32bit and 64bit data types.
``` alpaka/atomic/AtomicOaccExtended.hpp:209:36: error: comparing floating-point with '==' or '!=' is unsafe [-Werror=float-equal] ```
For architectures where __hip_atomic_compare_exchange_strong is false HIP is emulating atomics with `atomicCAS` but with a broken implementation.
c400aca
to
eb579fd
Compare
In #1767 I already enabled the usage of building atomicAdd for double precision.
Nevertheless, we are emulating in alpaka often atomics because it is a nightmare to maintain the list of supported atomic functions for HIP and CUDA. The list of supported types for different atomics differs between both models.
This PR introduces a "simple" way to let the compiler decide if a built-in function is available and use these. If not we will emulate the required atomics by using
atomicCas
.hierarchy::Threads
-> mapped toatomicXXX_block()