
Expand and add more CUDA/HIP documentation #1309

Open
wants to merge 7 commits into main from cuda-docs

Conversation

@msimberg (Contributor) commented Nov 1, 2024

Adds documentation for classes that weren't yet documented and updates the existing documentation. Adds a section for pika/cuda.hpp to the API documentation.

Early stages, far from complete, but may require some discussion before continuing.

@msimberg msimberg self-assigned this Nov 1, 2024
codacy-production bot commented Nov 1, 2024

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation: -0.01% (target: -1.00%)
Diff coverage: ∅ (target: 90.00%)

Coverage variation details:
Common ancestor commit (2df06f7): 18285 coverable lines, 13814 covered, 75.55% coverage
Head commit (0ed5c9d): 18274 coverable lines (-11), 13803 covered (-11), 75.53% coverage (-0.01%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details:
Pull request (#1309): 0 coverable lines, 0 covered, ∅ (not applicable)

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%


docs/api.rst (resolved)
docs/api.rst (outdated, resolved)
docs/api.rst (outdated, resolved)
docs/api.rst (outdated, resolved)
:language: c++
:start-at: #include

.. doxygenvariable:: pika::cuda::experimental::then_with_stream
msimberg (Contributor, Author) commented:


This one is documented a bit oddly at the moment: we document the variable, but not the call operators of the then_with_stream_t class. I'm torn between having an example, which is usually clear enough on its own, and documenting the call operators explicitly. Currently I refer e.g. to \p f as the callable passed to the adaptor, but the user has no information on how that callable is passed to the adaptor (not much of a problem for then_with_stream, but a bigger problem for then_with_cublas since that also takes a pointer mode).

Should I include both an example and document the call operators? If we document the call operators, should we do the same for drop_value, drop_operation_state, etc.?
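
For context, a minimal sketch of the kind of call being discussed (hypothetical, not taken from this PR; runtime startup and event polling are omitted). It assumes a cuda_scheduler created from a cuda_pool, that ex::schedule works with it, and that then_with_stream invokes the callable with the stream it schedules work on as the last argument.

#include <pika/cuda.hpp>
#include <pika/execution.hpp>

#include <cstddef>

namespace cu = pika::cuda::experimental;
namespace ex = pika::execution::experimental;

// Hypothetical sketch: build a sender that runs the lambda on the CUDA
// scheduler; the lambda is assumed to receive the stream to enqueue
// asynchronous work on.
auto zero_on_device(cu::cuda_scheduler sched, double* data, std::size_t n)
{
    return ex::schedule(sched) |
        cu::then_with_stream([=](cudaStream_t stream) {
            // enqueue asynchronous work on the stream, e.g.
            cudaMemsetAsync(data, 0, n * sizeof(double), stream);
        });
}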

msimberg (Contributor, Author) commented:


Explore separately from this PR what the documentation would look like with e.g. documentation for the call operator on the CPO type.

msimberg (Contributor, Author) commented:


I tried documenting the types of the CPOs as well, and ended up with something that looks like this:

[Screenshot of the rendered documentation: 2024-11-07-153409_971x952_scrot]

i.e. first the type, then the CPO itself, and then the example. I tried to keep it quite short, referring to the CPO documentation for more details. The important part for me was to get the signature of the CPO visible, so I've documented the two operator()s and their parameters.

@albestro, @aurianer, @rasolca, @RMeli thoughts?
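
For reference, a hypothetical sketch of the kind of call whose signature is being documented (not taken from this PR). It assumes then_with_cublas takes the callable plus a cuBLAS pointer mode, and invokes the callable with a cublasHandle_t bound to one of the pool's streams.

#include <pika/cuda.hpp>
#include <pika/execution.hpp>

#include <cublas_v2.h>

namespace cu = pika::cuda::experimental;
namespace ex = pika::execution::experimental;

// Hypothetical sketch: enqueue a single-precision GEMM through then_with_cublas.
auto gemm_on_device(cu::cuda_scheduler sched, float const* a, float const* b,
    float* c, int n)
{
    return ex::schedule(sched) |
        cu::then_with_cublas(
            [=](cublasHandle_t handle) {
                float const alpha = 1.0f;
                float const beta = 0.0f;
                // the GEMM runs on the stream associated with the handle
                cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n, &alpha,
                    a, n, b, n, &beta, c, n);
            },
            CUBLAS_POINTER_MODE_HOST);
}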

docs/api.rst (resolved)
@msimberg force-pushed the cuda-docs branch 5 times, most recently from 37b7898 to afec1e1 on November 7, 2024 at 14:41
@msimberg (Contributor, Author) commented Nov 7, 2024

I welcome more feedback on this PR now. I feel like some of the examples I've created are a bit forced/artificial in the name of showing off what's possible with the adaptors, so suggestions on how to improve or change them are particularly welcome. I've also now documented the types of the CUDA/HIP sender adaptors so that we can document their parameters. I think it looks decent enough, but I'm interested in hearing feedback from others.

@msimberg marked this pull request as ready for review November 7, 2024 14:46
@msimberg force-pushed the cuda-docs branch 5 times, most recently from 54cc672 to 1a9dd5f on November 8, 2024 at 12:23
docs/api.rst (outdated, resolved)
docs/api.rst (outdated, resolved)
examples/documentation/then_with_cublas_documentation.cu (outdated, resolved)
Comment on lines +170 to +177
/// \param device the CUDA device used for scheduling work
/// \param num_normal_priority_streams_per_thread the number of normal priority streams per
/// thread
/// \param num_high_priority_streams_per_thread the number of high priority streams per
/// thread
/// \param flags flags used to construct CUDA streams
/// \param num_cublas_handles the number of cuBLAS handles to create for the whole pool
/// \param num_cusolver_handles the number of cuSOLVER handles to create for the whole pool
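
For illustration, a hypothetical construction of a cuda_pool with these parameters spelled out. The values below are arbitrary placeholders, not the library defaults, and the parameter types and order are assumed to follow the \param entries above.

#include <pika/cuda.hpp>

#include <cstddef>

namespace cu = pika::cuda::experimental;

// Illustrative values only; see the parameter documentation above.
cu::cuda_pool make_pool()
{
    int const device = 0;
    std::size_t const num_normal_priority_streams_per_thread = 3;
    std::size_t const num_high_priority_streams_per_thread = 1;
    unsigned int const flags = 0;  // forwarded when constructing the CUDA streams
    std::size_t const num_cublas_handles = 8;
    std::size_t const num_cusolver_handles = 2;

    return cu::cuda_pool{device, num_normal_priority_streams_per_thread,
        num_high_priority_streams_per_thread, flags, num_cublas_handles,
        num_cusolver_handles};
}
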
A contributor commented:


Add a short comment on how the default values have been selected?

msimberg (Contributor, Author) commented:


I'll try 🙃 They are "arbitrary"/decent average values that seem not to perform too badly with DLA-Future.

A contributor commented:


I think that would be enough. It's good for people who are interested in pika but have no knowledge of DLA-Future to know that these values are based on a production project.

msimberg (Contributor, Author) commented:


I've updated the documentation for the parameters in f267193. The documentation here anticipates #1294: I'm still documenting the per-thread parameters, but I'd like to get #1294 merged first.

Do you find the warnings sufficient? We're essentially making a silent API change with #1294, and I'm worried that it's too silent (especially if one doesn't read the documentation or release notes)... The main alternative I can think of is to make the parameters strong types instead of plain integers, thus requiring an explicit conversion at the call site.
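
To make the strong-type alternative concrete, a rough sketch of what it could look like (the type names are hypothetical and not part of pika):

#include <cstddef>

// Hypothetical strong types for the stream counts. A constructor taking these
// instead of plain integers cannot be called with bare numbers, so the changed
// meaning of the arguments cannot go unnoticed at the call site.
struct normal_priority_streams_per_thread
{
    explicit normal_priority_streams_per_thread(std::size_t count) : value(count) {}
    std::size_t value;
};

struct high_priority_streams_per_thread
{
    explicit high_priority_streams_per_thread(std::size_t count) : value(count) {}
    std::size_t value;
};

// cuda_pool pool{device, 3, 1, ...};                               // would no longer compile
// cuda_pool pool{device, normal_priority_streams_per_thread{3},
//     high_priority_streams_per_thread{1}, ...};                   // intent is explicit at the call site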

@RMeli (Contributor) commented Nov 19, 2024


The new documentation looks great, thank you.

The silent change is quite disruptive indeed, and I see why you are worried it is a bit too silent. However, I think it is probably enough combined with the release notes, mostly because pika is still in 0.x and semantic versioning explicitly states that anything can change, and I don't see many issues from external users in the repository (so usage outside of DLA-Future might still be reasonably low).

> The main alternative I can think of is to make the parameters strong types instead of plain integers, thus requiring an explicit conversion at the call site.

I did not look at #1294, but another alternative would be to have a flag/option allowing one to switch between the old and new behavior, with the old behavior as the default together with a deprecation warning. However, it might be a PITA to implement and overkill for the reasons above.

msimberg (Contributor, Author) commented:


Ok, thanks, good to hear you're not too concerned. This kind of stuff is indeed why we've stuck to 0.X and semver, so I suppose I should exercise that right... it's just an uneasy feeling knowing that everything will compile just fine, but possibly perform differently.

> overkill for the reasons above.

Yeah, I'd like to avoid that if possible.

However, I am planning on having a warning about this in DLA-Future (and basically introducing a new configuration option), so there the change should definitely not be silent.

Labels: None yet
Projects: Status: In Progress
3 participants