-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand and add more CUDA/HIP documentation #1309
base: main
Are you sure you want to change the base?
Conversation
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferencesCodacy stopped sending the deprecated coverage status on June 5th, 2024. Learn more |
:language: c++ | ||
:start-at: #include | ||
|
||
.. doxygenvariable:: pika::cuda::experimental::then_with_stream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is a bit weirdly documented at the moment, since we document the variable, but not the call operators of the then_with_stream_t
class. I'm between having an example, which is usually clear enough on its own, and just documenting the call operators explicitly. Currently I refer e.g. to \p f
as the callable passed to the adaptor, but the user has no information on how that is passed to the adaptor (not much of a problem for then_with_stream
but a bigger problem for then_with_cublas
since that also takes a pointer mode).
Should I include both an example and document the call operators? If we document the call operators should we do the same for drop_value
, drop_operation_state
, etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explore separately from this PR what documentation would look like with e.g. documentation for call operator on CPO type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up trying to document the types of the CPOs as well, and ended up with something that looks like this:
i.e. first type, then CPO itself, and then the example. I tried to keep it quite short, referring to the CPO documentation for more details. The important part for me was to get the signature of the CPO visible, so I've documented the two operator()
s and their parameters.
libs/pika/async_cuda_base/include/pika/async_cuda_base/cublas_handle.hpp
Show resolved
Hide resolved
libs/pika/async_cuda/include/pika/async_cuda/then_with_stream.hpp
Outdated
Show resolved
Hide resolved
libs/pika/async_cuda_base/include/pika/async_cuda_base/cuda_stream.hpp
Outdated
Show resolved
Hide resolved
37b7898
to
afec1e1
Compare
I welcome more feedback on this PR now. I feel like some of the examples I've created are a bit forced/artificial in the name of showing off what's possible with the adaptors, so if you have suggestions on how to improve change them that's particularly welcome. I've also now documented the types of the CUDA/HIP sender adaptors so that we can document the parameters of them. I think it looks decent enough, but interested in hearing feedback from others. |
54cc672
to
1a9dd5f
Compare
libs/pika/async_cuda/include/pika/async_cuda/then_with_stream.hpp
Outdated
Show resolved
Hide resolved
libs/pika/async_cuda_base/include/pika/async_cuda_base/cublas_handle.hpp
Outdated
Show resolved
Hide resolved
libs/pika/async_cuda_base/include/pika/async_cuda_base/cusolver_handle.hpp
Outdated
Show resolved
Hide resolved
libs/pika/async_cuda/include/pika/async_cuda/then_with_stream.hpp
Outdated
Show resolved
Hide resolved
/// \param device the CUDA device used for scheduling work | ||
/// \param num_normal_priority_streams_per_thread the number of normal priority streams per | ||
/// thread | ||
/// \param num_high_priority_streams_per_thread the number of high priority streams per | ||
/// thread | ||
/// \param flags flags used to construct CUDA streams | ||
/// \param num_cublas_handles the number of cuBLAS handles to create for the whole pool | ||
/// \param num_cusolver_handles the number of cuSOLVER handles to create for the whole pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a short comment on how the default values have been selected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try 🙃 They are "arbitrary"/decent average value that seem to not perform too badly with DLA-Future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be enough. It's good for people interested in pika but without any knowledge of DLA-Future to know these values are based on a production project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the documentation for the parameters in f267193. I'm anticipating #1294 with the documentation here. I'm still documenting the per-thread parameters here, but I'd like to get #1294 merged first.
Do you find the warnings sufficient? We're essentially making a silent API change with #1294, and I'm worried that it's too silent (especially if one doesn't read documentation or release notes)... The main alternative I can think of is to make the parameters strong types instead of plain integers, thus requiring a explicit conversion at the callsite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new documentation looks great, thank you.
The silent change is quite disruptive indeed, and I see why you are worried it is a bit too silent. However, I think it is probably enough combined with the release notes. Mostly because pika
is still in 0.x
and semantic version explicitly states that anything can change, and I don't see many issues from external users in the repository (so usage outside of DLA-Future might still be reasonably low).
The main alternative I can think of is to make the parameters strong types instead of plain integers, thus requiring a explicit conversion at the callsite.
I did not look at #1294, but another alternative would be to have a flag/option allowing to switch between old and new behavior, and have the old behavior by default together with a deprecation warning. However, it might be a PITA to implement and too overkill for the reasons above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks, good to hear you're not too concerned. This kind of stuff is indeed why we've stuck to 0.X and semver, so I suppose I should exercise that right... it's just an uneasy feeling knowing that everything will compile just fine, but possibly perform differently.
too overkill for the reasons above.
Yeah, I'd like to avoid that if possible.
However, I am planning on having a warning about this in DLA-Future (and basically introducing a new configuration option), so there the change should definitely not be silent.
libs/pika/async_cuda_base/include/pika/async_cuda_base/cuda_stream.hpp
Outdated
Show resolved
Hide resolved
libs/pika/async_cuda_base/include/pika/async_cuda_base/cuda_stream.hpp
Outdated
Show resolved
Hide resolved
Document cuda_pool, cuda_scheduler, cuda_stream, cublas_handle, cusolver_handle, as well as expose these with CUDA sender adaptors in the documentation. Adds a high-level example of using CUDA functionality.
Adds documentation for classes that weren't yet documented and updates the existing documentation. Adds a section for
pika/cuda.hpp
to the API documentation.Early stages, far from complete, but may require some discussion before continuing.