Add dimensions description functionality to CUDA Experimental library #1743
There are several changes from the previous iteration: dims and flatten were replaced with extents, which fits what the function does and matches mdspan::extents(). Some initial inline rst documentation was added, but it is not being built yet. Header guards were adjusted to the new library name. #undef NDEBUG was added to testing_common.h to properly enable assertions in device-side testing.
Disable c++17 windows builds because mdspan is not supported there
Also adds a comment describing a new test of rst docs examples
pre-commit.ci autofix |
/ok to test |
🟨 CI Results [ Failed: 35 | Passed: 328 | Total: 363 ]
|
# | Runner |
---|---|
275 | linux-amd64-cpu16 |
40 | linux-amd64-gpu-v100-latest-1 |
28 | linux-arm64-cpu16 |
20 | windows-amd64-cpu16 |
👃 Inspect Changes
Modifications in project?
Project | |
---|---|
+/- | CCCL Infrastructure |
libcu++ | |
CUB | |
Thrust | |
+/- | CUDA Experimental |
Modifications in project or dependencies?
Project | |
---|---|
+/- | CCCL Infrastructure |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
/ok to test |
🟩 CI Results [ Failed: 0 | Passed: 363 | Total: 363 ]
cudax/include/cuda/experimental/detail/hierarchy_dimensions.cuh
Cleaned up the order of annotations and constexpr, removed __device__. Moved to absolute includes. Added nodiscard to all functions in the detail namespace. Added a C++17 ifdef. Changed header guards to a new format applicable after I move some files in a future change. A couple of _LIBCUDACXX_UNREACHABLE and other fixes
and the main header to hierarchy.cuh
/ok to test |
🟨 CI Results [ Failed: 36 | Passed: 327 | Total: 363 ]
/ok to test |
🟨 CI Results [ Failed: 32 | Passed: 331 | Total: 363 ]
/ok to test |
🟩 CI Results [ Failed: 0 | Passed: 363 | Total: 363 ]
/ok to test |
🟩 CI Results [ Failed: 0 | Passed: 363 | Total: 363 ]
This change adds a static_cast to the product_type of a given level on extents operations, instead of only passing it as the extents template argument, as is currently done. This change also adds a test case for the above issue.
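As a rough illustration of why casting to a wider type before multiplying can matter, here is a minimal sketch. It is an assumption about the motivation, not something stated in this PR, and the helper names `product_narrow` and `product_wide` are hypothetical and not part of cudax.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the cudax implementation.
// Assumes a platform where unsigned int is 32 bits wide.

// Multiplies in 32-bit unsigned arithmetic, which wraps around, and only
// then widens the already wrapped result.
constexpr std::uint64_t product_narrow(std::uint32_t a, std::uint32_t b) {
  return a * b;
}

// Casts one operand to the wider type first, so the multiplication itself
// is carried out in 64 bits.
constexpr std::uint64_t product_wide(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint64_t>(a) * b;
}

// 65536 * 65536 == 2^32, which does not fit in 32 bits.
static_assert(product_narrow(65536u, 65536u) == 0, "wrapped in 32 bits");
static_assert(product_wide(65536u, 65536u) == 4294967296ull, "kept in 64 bits");
```

Passing the wider type only as a template argument of the result would not change the type in which the intermediate multiplication happens; the explicit cast does.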
/ok to test |
🟨 CI Results [ Failed: 4 | Passed: 359 | Total: 363 ]
/ok to test |
- {jobs: ['build'], project: 'cudax', ctk: [*ctk_12_0, *ctk_curr], std: 'all', cxx: [*gcc9, *gcc10, *gcc11]}
- {jobs: ['build'], project: 'cudax', ctk: [*ctk_12_0, *ctk_curr], std: 'all', cxx: [*llvm9, *llvm10, *llvm11, *llvm12, *llvm13, *llvm14]}
- {jobs: ['build'], project: 'cudax', ctk: [ *ctk_curr], std: 'all', cxx: [*llvm15]}
Is that still valid? I would guess that std: 'all' will also test C++11 / C++14
I agree this seems wrong, but then why are there only c++17/20 jobs running for this PR?
🟨 CI finished in 4h 00m: Pass: 99%/420 | Total: 2d 06h | Avg: 7m 43s | Max: 1h 23m | Hits: 96%/521783
🏃 Runner counts (total jobs: 420)
# | Runner |
---|---|
305 | linux-amd64-cpu16 |
64 | linux-amd64-gpu-v100-latest-1 |
28 | linux-arm64-cpu16 |
23 | windows-amd64-cpu16 |
🟩 CI finished in 17h 04m: Pass: 100%/420 | Total: 2d 07h | Avg: 7m 52s | Max: 1h 23m | Hits: 96%/522634
This pull request adds the hierarchy_dimensions type template, which describes a hierarchy of CUDA threads with a mix of static and dynamic information. It can be passed into kernels and used to calculate aggregates at compile time (for example, counting the threads in each CUDA block to create a statically sized array). It can also make libraries aware of the shape of the currently running grid and let them optimize some thread ID calculations with compile-time values.
The hierarchy_dimensions type template is essentially a tuple of level_dimensions entries, one per level. Each level consists of two things. The first is a type that identifies the level, for example block_level or cluster_level. The second is a cuda::std::extents object that describes the dimensions of that level with both static and dynamic values, the same way it does for cuda::std::mdspan objects.
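To make the idea concrete, here is a minimal host-only sketch of a tuple of tagged levels that mixes static and dynamic sizes. It is not the cudax interface: `level_desc`, `hierarchy_desc`, `per_block_scratch`, `dynamic_count` and `example_total_threads` are hypothetical names, the tag types are local stand-ins, and each level is simplified to a single count rather than a `cuda::std::extents` object.

```cpp
#include <array>
#include <cstddef>
#include <tuple>

// All names below are hypothetical stand-ins, not the cudax interface.
struct grid_level {};
struct block_level {};

// Marker meaning "this size is only known at run time".
inline constexpr std::size_t dynamic_count = static_cast<std::size_t>(-1);

// One level: a tag type saying which level it is, plus a size that is either
// a compile-time constant or a value stored at run time.
template <class Level, std::size_t StaticCount = dynamic_count>
struct level_desc {
  using level_type = Level;
  static constexpr bool is_static = (StaticCount != dynamic_count);
  static constexpr std::size_t static_count = StaticCount;
  std::size_t runtime_count = is_static ? StaticCount : 0;

  constexpr std::size_t count() const {
    return is_static ? StaticCount : runtime_count;
  }
};

// The hierarchy is a tuple of level descriptors.
template <class... Levels>
struct hierarchy_desc {
  std::tuple<Levels...> levels;

  // Product of the sizes of all levels.
  constexpr std::size_t total_threads() const {
    return std::apply(
      [](const auto&... l) { return (l.count() * ... * std::size_t{1}); },
      levels);
  }
};

// Storage sized from the static part of the description.
template <class Block>
struct per_block_scratch {
  static_assert(Block::is_static, "block size must be known at compile time");
  std::array<int, Block::static_count> slots{};
};

// A grid with a run-time number of blocks, each with 256 threads (static).
inline std::size_t example_total_threads(std::size_t grid_blocks) {
  using block_t = level_desc<block_level, 256>;
  using grid_t  = level_desc<grid_level>;
  static_assert(per_block_scratch<block_t>{}.slots.size() == 256,
                "array size comes from the static block size");
  hierarchy_desc<grid_t, block_t> h{
    std::tuple<grid_t, block_t>{grid_t{grid_blocks}, block_t{}}};
  return h.total_threads();
}
```

In this sketch the block size is part of the type, so it can size an array at compile time, while the grid size is carried as a run-time value in the same object.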
This is an initial implementation. There are a number of TODOs, names are not final, and the interface may change depending on the feedback received.
This type is also a building block for other libraries and features that are in the pipeline.
This functionality was initially a part of the PR that added CUDA Experimental, but was separated for easier review.
Compared to the previous pull request:
NVBUG 4541889