An example using cuBLAS library in alpaka #2430
N,
K, // Dimensions: C = A * B
&alpha,
alpaka::getPtrNative(bufDevA),
Better to use std::data() instead of alpaka::getPtrNative().
ok, done, thanks.
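For readers following the std::data() suggestion: std::data() is the standard free function that returns whatever a container's data() member returns, so generic code can obtain a raw pointer without a library-specific accessor. The sketch below is only an illustration under stated assumptions. FakeBuffer, sumRaw and sumBuffer are hypothetical names, FakeBuffer is a host-side stand-in and not an alpaka type, and sumRaw stands in for a C-style library routine that takes a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator> // std::data
#include <vector>

// Hypothetical stand-in for a buffer type: NOT an alpaka type.
// It mirrors only the one property that matters here, a data() member.
struct FakeBuffer
{
    std::vector<float> storage;

    float* data() { return storage.data(); }
    float const* data() const { return storage.data(); }
};

// Hypothetical stand-in for a C-style library routine that takes a raw pointer.
inline float sumRaw(float const* ptr, std::size_t n)
{
    assert(ptr != nullptr || n == 0);
    float sum = 0.0f;
    for(std::size_t i = 0; i < n; ++i)
        sum += ptr[i];
    return sum;
}

// Generic helper: accepts any type that std::data() can handle
// (standard containers, C arrays, or user types with a data() member).
template<typename TBuf>
float sumBuffer(TBuf const& buf, std::size_t n)
{
    return sumRaw(std::data(buf), n);
}
```

The same sumBuffer call works for FakeBuffer and for std::vector<float>, which is the kind of uniformity the reviewer's suggestion appears to be aiming for.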
Idx const K = 3; // Columns in A and rows in B

// Define device and queue
using Acc = alpaka::AccGpuCudaRt<Dim1D, Idx>;
Could you please use the CUDA tag and derive the ACC from the tag? This will reduce the work as soon as we refactor the accelerators.
I used the standard tags like the other examples, but prevented this example from being configured in CMake if the ACC_CUDA_ONLY CMake variable is not set.
Anyway, I used the CUDA tag as you suggested. This example could have a direct main rather than using ExampleTags, since it will only run with a single backend.
using Acc = alpaka::TagToAcc<alpaka::TagGpuCudaRt, Dim1D, Idx>;
@psychocoderHPC I agree with @mehmetyusufoglu. It does not make sense to use the same template as in the other examples. The code can only be used with the CUDA backend, so we do not need the complicated iteration over the enabled tags.
This PR is closed because the changes were added to #2433, since the two PRs will share the same directory in |
This example uses the cuBLAS library for matrix multiplication, operating on buffers allocated through alpaka and using an alpaka queue. Another example uses the rocBLAS library.
The CMake file still needs to be changed so that CI does not fail for other backends.
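One detail worth keeping in mind when reading the GEMM arguments in an example like this: cuBLAS interprets matrices as column-major. If the host data is row-major (an assumption here, since the PR text does not say which layout it uses), a common approach is to compute C^T = B^T * A^T by swapping the two operands and passing the dimensions as (N, M, K). The following is a CPU-only sketch of that idea, not the code from this PR. gemmColMajor and matmulRowMajor are hypothetical names, and gemmColMajor only imitates the column-major, non-transposed convention of cublasSgemm so the index arithmetic can be checked without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference using the column-major convention of a non-transposed GEMM:
// C (m x n) = alpha * A (m x k) * B (k x n) + beta * C, with leading dimensions lda, ldb, ldc.
inline void gemmColMajor(
    std::size_t m, std::size_t n, std::size_t k,
    float alpha, float const* A, std::size_t lda,
    float const* B, std::size_t ldb,
    float beta, float* C, std::size_t ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for(std::size_t j = 0; j < n; ++j)
    {
        for(std::size_t i = 0; i < m; ++i)
        {
            float acc = 0.0f;
            for(std::size_t p = 0; p < k; ++p)
                acc += A[i + p * lda] * B[p + j * ldb];
            C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc];
        }
    }
}

// Row-major C (M x N) = A (M x K) * B (K x N) computed through the column-major routine.
// A row-major matrix read as column-major is its transpose, so we ask for
// C^T (N x M) = B^T (N x K) * A^T (K x M): operands swapped, dimensions (N, M, K).
inline std::vector<float> matmulRowMajor(
    std::vector<float> const& A, std::vector<float> const& B,
    std::size_t M, std::size_t N, std::size_t K)
{
    assert(A.size() == M * K && B.size() == K * N);
    std::vector<float> C(M * N, 0.0f);
    gemmColMajor(N, M, K, 1.0f, B.data(), N, A.data(), K, 0.0f, C.data(), N);
    return C;
}
```

With a real cuBLAS handle, the same operand order and leading dimensions would be passed to the library call, with device pointers in place of the host pointers used here.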