Integrate CUDASTF -> CudaX #2572

Merged: 364 commits, Oct 24, 2024
Showing changes from 250 commits

Commits
ad60e3b
use the proper source_location everywhere
caugonnet Oct 15, 2024
cbdcd6a
ensure the == operator on thread_hierarchy is not applied on the same…
caugonnet Oct 15, 2024
7ebd502
Try to avoid ambiguous operators
caugonnet Oct 15, 2024
e73e5fa
add a static assertion to detect invalid operator matches
caugonnet Oct 15, 2024
ca4a389
Erroneous assertions
caugonnet Oct 15, 2024
26e333c
Try to avoid ambiguous operators
caugonnet Oct 15, 2024
fe864ca
Try to avoid ambiguous operators
caugonnet Oct 15, 2024
96e0864
Fix ambiguities
miscco Oct 15, 2024
c40076b
Fix more
miscco Oct 15, 2024
5efbe2e
Revert "Try to avoid ambiguous operators"
miscco Oct 15, 2024
32500bf
Fix comparisons for good
miscco Oct 15, 2024
ee6170a
Try to avoid CTAD issues with older clang
miscco Oct 15, 2024
52b3d89
avoid warning from gcc
miscco Oct 15, 2024
d8ea187
use CCCL macros
miscco Oct 15, 2024
d078088
Do not use open `__host__` or `__device__`
miscco Oct 15, 2024
0c3c7db
Try and move calculation out of template header
miscco Oct 15, 2024
7a09a95
Do not use MSVC macro
miscco Oct 15, 2024
caacad1
No source location for gcc11
miscco Oct 15, 2024
4d0b5d2
Fix the example project which the documentation refers to (#2531)
caugonnet Oct 10, 2024
d2465da
Enable tests/headertests for c/parallel in all-dev presets. (#2566)
alliepiper Oct 11, 2024
88cb550
Rename cudax test targets to match CCCL conventions. (#2568)
alliepiper Oct 12, 2024
0c7ac3f
Update project list in issue template (#2532)
alliepiper Oct 12, 2024
b714287
Disable compiler extensions on CCCL targets. (#2559)
alliepiper Oct 14, 2024
4cf6604
fixes batched memcpy to support ptr-to-const buffer sources (#2573)
elstehle Oct 14, 2024
31ab4bd
Fix an error in the documentation of the ci/build_common.sh script wh…
caugonnet Oct 14, 2024
652f5ec
gcc-14 gained the ability to mangle `noexcept` expressions (#2565)
ericniebler Oct 14, 2024
f5bc208
Miscellaneous simple fixes (#2575)
rwgk Oct 14, 2024
a7e7ffd
Avoid including `yvals.h` when the compiler is not MSVC. (#2545)
wmaxey Oct 14, 2024
ac1e957
Fix popc.h when architecture is not x86 on MSVC. (#2524)
wmaxey Oct 14, 2024
8c7fb97
test for exceptions support on msvc with the `_CPPUNWIND` macro (#2576)
ericniebler Oct 14, 2024
140ad31
fix the forwarding of the receiver in the `just_from` algorithm (#2569)
ericniebler Oct 14, 2024
aeee99e
Block type pack indexing on NVCC (#2563)
wmaxey Oct 14, 2024
0bb59e5
Cleanup the semaphore headers (#2441)
miscco Oct 15, 2024
e211869
Add `_CCCL_GRID_CONSTANT` macro (#2530)
fbusato Oct 15, 2024
9a110f2
Add `_CCCL_RESTRICT` macro (#2529)
fbusato Oct 15, 2024
14ce44a
Try to use the same redefinition of `__assert_fail` as pytorch has (#…
miscco Oct 15, 2024
dca8e39
Fix miscellaneous bugs in documentation. (#2580)
rwgk Oct 15, 2024
e10c75d
Expose parts of `<cuda/std/memory>` (#2502)
fbusato Oct 15, 2024
134f087
add a config macro for testing support for inline variables (#2581)
ericniebler Oct 15, 2024
f9c3a47
add missing header
caugonnet Oct 15, 2024
ba4dcf2
add missing header for constants
caugonnet Oct 15, 2024
6d0e4a4
Revert "use CCCL macros"
caugonnet Oct 15, 2024
8814d3a
Only enable UNITTESTS with ARGS when we have C++20 because it needs _…
caugonnet Oct 15, 2024
3ad95c2
fix definition of STF_HAS_UNITTEST_WITH_ARGS
caugonnet Oct 15, 2024
da42da8
fix fallback macro
caugonnet Oct 15, 2024
d41d4bb
disable more unit tests which have params
caugonnet Oct 15, 2024
ece9600
add missing template keywords
caugonnet Oct 15, 2024
4d6e6fd
Add config include and header guard
miscco Oct 16, 2024
122641f
disable experiment needing pthreads for now
Oct 16, 2024
8f95cc2
Remove reference qualifier on Fun to have a non reference passed to t…
caugonnet Oct 16, 2024
110021d
Use `ptrdiff_t` instead of `ssize_t`
miscco Oct 16, 2024
5f55337
Add missing include
miscco Oct 16, 2024
4a7707c
MSVC needs a conforming preprocessor
miscco Oct 16, 2024
dfd17a0
Properly use begin / end on vector
miscco Oct 16, 2024
44ac3d0
use `int` for the `dev_id` in `stream_pool`
miscco Oct 16, 2024
8e50703
Fix some conversion warnings
miscco Oct 16, 2024
a73a19e
Allow exception to be copied
miscco Oct 16, 2024
714ca04
Avoid shadowing warning
miscco Oct 16, 2024
a558cec
MSVC doesn't like decltype in a template header
miscco Oct 16, 2024
9c2f964
Use the right fprintf operator
miscco Oct 16, 2024
81a77c4
More conversion
miscco Oct 16, 2024
f949dd8
Avoid shadowing
miscco Oct 16, 2024
75f6083
More conversions
miscco Oct 16, 2024
65f96d3
use begin end of vector
miscco Oct 16, 2024
2d98d5b
Avoid the identifier `small` on windows
miscco Oct 16, 2024
bbe194e
Use alias for base class
miscco Oct 16, 2024
23a79e7
Avoid callback_queues on windows
miscco Oct 16, 2024
1b79f13
More conversion issues
miscco Oct 16, 2024
0903c8e
Drop unneeded template keyword
miscco Oct 16, 2024
23e64e8
Add missing include
miscco Oct 16, 2024
3558aa9
fix more conversions
miscco Oct 16, 2024
bb1860f
Try to make MSVC see that we are inheriting
miscco Oct 16, 2024
1f0ef9f
Disable test for windows
miscco Oct 16, 2024
91d139f
Avoid unreachable warnings
miscco Oct 16, 2024
27c8da0
Avoid uninitialized variable warnings
miscco Oct 16, 2024
69d4b17
More conversions
miscco Oct 16, 2024
4750661
Suppress the unreachable warnings instead
miscco Oct 16, 2024
0cc117c
More conversion
miscco Oct 16, 2024
47de30c
Disable unused variable warning
miscco Oct 16, 2024
c6f66b4
try to work around stupid unreachable warnings
miscco Oct 16, 2024
0428790
Disable more filesystem tests on MSVC
miscco Oct 16, 2024
7483dcd
more conversions...
miscco Oct 16, 2024
0cfe9b1
More conversions
miscco Oct 16, 2024
d16094a
take the static variable from the type not the instance
miscco Oct 16, 2024
1aaff27
actually use the right class
miscco Oct 16, 2024
e96bfd1
Use windows signal handler
miscco Oct 16, 2024
65eec56
More conversions
miscco Oct 16, 2024
5162d73
More shadowing
miscco Oct 16, 2024
66471c6
More conversions
miscco Oct 16, 2024
64da8d2
Import the original README.md for CUDASTF into docs/cudax/stf.rst
caugonnet Oct 16, 2024
2659d10
link to stf in cudax doc
caugonnet Oct 16, 2024
a62bbc4
Conversions all over
miscco Oct 16, 2024
27d7591
Shadowing warning
miscco Oct 16, 2024
63e4bf4
More conversions and unused variables
miscco Oct 16, 2024
610671c
Shadowing warnings
miscco Oct 16, 2024
cfa5b34
Updates in stf documentation
caugonnet Oct 16, 2024
5e9902f
stop defining specializations of std::hash, define them in our namespace
caugonnet Oct 16, 2024
4d61327
use our custom hash
caugonnet Oct 16, 2024
479961c
hash fallback to std::hash if not specialized
caugonnet Oct 16, 2024
018c596
Use a fallback on std::hash if our hash is not specialized
caugonnet Oct 16, 2024
4344702
Pass a non ref version of Fun everywhere
caugonnet Oct 16, 2024
1d8e67c
fix a reference in the doc
caugonnet Oct 16, 2024
4fb0313
remove redundant traits
caugonnet Oct 16, 2024
62631eb
Fix a typo
caugonnet Oct 16, 2024
a1b3b23
regexp suitable for the static assertions of different compilers
caugonnet Oct 16, 2024
6d06bc2
Restore changes
miscco Oct 17, 2024
517e32c
Remove unused variable
caugonnet Oct 17, 2024
779cf70
remove an unused task dep
caugonnet Oct 17, 2024
4926252
Remove exclusions in the test which are outdated
caugonnet Oct 17, 2024
948d3bb
avoid type conversions
caugonnet Oct 17, 2024
c05e3dc
Add images for the stf doc
caugonnet Oct 17, 2024
212fc31
import a doc for low level stf api
caugonnet Oct 17, 2024
e6b2664
Fix type conversion issues
caugonnet Oct 17, 2024
9667105
do not use ssize_t
caugonnet Oct 17, 2024
15570b1
do not document callback queues which are experimental
caugonnet Oct 17, 2024
4503cdb
do not use ::std in comments
caugonnet Oct 17, 2024
d229eed
minor type fixes
caugonnet Oct 17, 2024
821b556
do not use M_PI
caugonnet Oct 17, 2024
60003c2
Improvements for STF doc
caugonnet Oct 17, 2024
c32696a
Create a trivial STF example
caugonnet Oct 17, 2024
74bc606
doc updates
caugonnet Oct 17, 2024
94b6186
Doc improvement
caugonnet Oct 17, 2024
d7e8374
Hash for data_place and green ctx view are in our namespace now
caugonnet Oct 17, 2024
dd71c42
move hash for pos4 and dim4
caugonnet Oct 17, 2024
34f0cdd
move hash per_data_instance_msi_state
caugonnet Oct 17, 2024
3981e16
hash namespaces
caugonnet Oct 17, 2024
796c585
add missing header
caugonnet Oct 17, 2024
2ab5686
fix url
caugonnet Oct 17, 2024
ef4c8c4
move hash of tasks
caugonnet Oct 17, 2024
7b3c844
doc fixes
caugonnet Oct 18, 2024
48cad8a
Replace CUDASTF macros by CCCL ones
caugonnet Oct 18, 2024
6fefcb9
Remove unused class task_id_t
caugonnet Oct 18, 2024
988b166
move implementation details into reserved namespace
caugonnet Oct 18, 2024
76dd67b
move implementation details into reserved namespace
caugonnet Oct 18, 2024
ca0fce4
machine => reserved::machine
caugonnet Oct 18, 2024
5c08ffb
more things in reserved::
caugonnet Oct 18, 2024
6157a46
use reserved::
caugonnet Oct 18, 2024
aa97444
use reserved::
caugonnet Oct 18, 2024
1342c95
remove a superfluous reserved::
caugonnet Oct 18, 2024
6f872ec
use reserved:: for hash tables
caugonnet Oct 18, 2024
745f51c
use reserved:: for hash tables
caugonnet Oct 18, 2024
6941302
reserved::
caugonnet Oct 18, 2024
cc654cb
fixes for hashtable
caugonnet Oct 18, 2024
ba81530
simpler code
caugonnet Oct 18, 2024
2fb894d
cached_getenv => reserved::cached_getenv
caugonnet Oct 18, 2024
f5df2ad
task_statistics => reserved::task_statistics
caugonnet Oct 18, 2024
cc60b7d
reserved::customHash
caugonnet Oct 18, 2024
7f2793e
code cleanup with namespaces in slice
caugonnet Oct 18, 2024
970cb09
reserved::handle
caugonnet Oct 18, 2024
42197cd
reserved::tiled_mdspan_shape
caugonnet Oct 18, 2024
da83e9f
use reserved::
caugonnet Oct 18, 2024
a8109fd
remove cudaGraphExecUpdateErrorString which is dead code
caugonnet Oct 18, 2024
680d2d3
use reserved::
caugonnet Oct 18, 2024
6ea9eb6
use reserved::
caugonnet Oct 18, 2024
75bf781
add more reserved and filter RESERVED from doxygen too
caugonnet Oct 18, 2024
a50cbb5
add missing reserved
caugonnet Oct 18, 2024
64af74d
try to fix doxygen for is_synchronizable
caugonnet Oct 18, 2024
ee36d9d
reserved::has_ostream_operator
caugonnet Oct 18, 2024
5c5354e
use reserved::
caugonnet Oct 18, 2024
7d97d17
reserved::buddy_allocator_metadata
caugonnet Oct 18, 2024
df8f646
note about child graphs
caugonnet Oct 18, 2024
538c418
fix doc levels
caugonnet Oct 18, 2024
37926fa
reserved::host_launch_scope
caugonnet Oct 18, 2024
af527df
move more classes in reserved
caugonnet Oct 18, 2024
3cca3dd
TRIVIAL: add group directive to documentation
andralex Oct 18, 2024
c9fa173
Fix build problem and make type_name work with clang, gcc, and msvc
andralex Oct 19, 2024
0be7346
Translate the tentative refcard into some rst text
caugonnet Oct 19, 2024
eab1fb4
minor improvement for doc
caugonnet Oct 20, 2024
97b49f1
Add some doc for thread_hierarchy_spec
caugonnet Oct 20, 2024
95dad9c
minor stuffs for doc
caugonnet Oct 20, 2024
7246269
more stf doc
caugonnet Oct 21, 2024
d6e6bb8
use __device__ in doc
caugonnet Oct 21, 2024
1866c21
reserved::repeat_scope
caugonnet Oct 21, 2024
11b805c
stf doc
caugonnet Oct 21, 2024
f1f7654
Merge branch 'main' into cudastf
caugonnet Oct 21, 2024
eda4cef
Merge branch 'main' into cudastf
caugonnet Oct 21, 2024
65914b4
try an alternative substr that is constexpr on MSVC
caugonnet Oct 21, 2024
3b0a325
keep it stupid and simple on MSVC for type_name
caugonnet Oct 21, 2024
0fdc09c
minor C++ fixes
caugonnet Oct 21, 2024
737c34e
fix an implicit conversion issue
caugonnet Oct 21, 2024
42b5098
more minor C++ fixes
caugonnet Oct 21, 2024
1d9e1f6
temporary hack to use ninja -k 0 on windows
caugonnet Oct 21, 2024
936a63c
provide missing header
caugonnet Oct 21, 2024
fa81ea9
use %zu to print size_t
caugonnet Oct 21, 2024
b3a1490
fix a typo
caugonnet Oct 21, 2024
8c81915
fix unused variable warnings
caugonnet Oct 21, 2024
4b1af94
fix unused variable warnings
caugonnet Oct 21, 2024
ad06b11
Temporarily disable type conversion warnings in MSVC
caugonnet Oct 21, 2024
2788334
proper fix for unused variables
caugonnet Oct 21, 2024
a66581c
use integers for simplicity
caugonnet Oct 21, 2024
e4607b4
do not use ssize_t
caugonnet Oct 21, 2024
d23e47f
do not use ssize_t
caugonnet Oct 21, 2024
b01d0f0
remove some ssize_t
caugonnet Oct 21, 2024
6352347
remove some ssize_t
caugonnet Oct 21, 2024
ad861eb
add missing doxygen block
caugonnet Oct 21, 2024
9854bce
fix a type error
caugonnet Oct 21, 2024
4582a50
fix unused vars
caugonnet Oct 21, 2024
3b01299
fix some var shadowing
caugonnet Oct 21, 2024
c42bf98
workaround some type conversion warning
caugonnet Oct 21, 2024
be78b70
workaround some type conversion warning
caugonnet Oct 21, 2024
f9b81a2
fix a typo
caugonnet Oct 21, 2024
8df7889
simpler code, no need to use the spec
caugonnet Oct 21, 2024
a1dcc4f
fix some misuse of auto
caugonnet Oct 21, 2024
354228a
minor C++ issues
caugonnet Oct 21, 2024
eff7fc0
fix type error
caugonnet Oct 21, 2024
81f873a
tell MSVC to ignore some unreachable code
caugonnet Oct 21, 2024
505b1fa
fix float/double errors
caugonnet Oct 21, 2024
90b35b8
pre-commit run
caugonnet Oct 21, 2024
11cfa23
Add doxygen to task::release
andralex Oct 21, 2024
3b8b79c
skip test on msvc
caugonnet Oct 21, 2024
6506371
pre-commit run
caugonnet Oct 21, 2024
d4b4b36
Define setenv on Windows
andralex Oct 21, 2024
d6e726b
Add inline to setenv
andralex Oct 21, 2024
e194c22
attempts to fix some doc issues
caugonnet Oct 21, 2024
119eb3c
fixed for doxygen
caugonnet Oct 21, 2024
acf9b36
more doxygen
caugonnet Oct 21, 2024
db443a2
Merge branch 'main' into cudastf
caugonnet Oct 22, 2024
bc15516
disable C4459
caugonnet Oct 22, 2024
cb7a4d7
fix an ifdef typo
caugonnet Oct 22, 2024
507fa7e
use reserved namespace and fix doxygen for mv
caugonnet Oct 22, 2024
6375528
try to help doxygen
caugonnet Oct 22, 2024
e463b8c
add missing string header
caugonnet Oct 22, 2024
64b1140
help doxygen
caugonnet Oct 22, 2024
8ef667e
fixes for sphinx
caugonnet Oct 22, 2024
8f4372c
rst improvements
caugonnet Oct 22, 2024
f1d7f33
skip that test with msvc
caugonnet Oct 22, 2024
7f2738f
rst format fixes
caugonnet Oct 22, 2024
8845aa8
Fix some constexpr issue
caugonnet Oct 22, 2024
c839599
Remove CUDASTF_PROVIDE_MDSPAN
caugonnet Oct 22, 2024
b295e19
mention task_fence
caugonnet Oct 22, 2024
eb58140
unused variables
caugonnet Oct 22, 2024
98014f9
use a fallback source_location for old CUDA versions on MSVC
caugonnet Oct 22, 2024
714d11f
cleaner code
caugonnet Oct 22, 2024
7596806
larger parameters to avoid timing issues
caugonnet Oct 22, 2024
1113130
Revert "use a fallback source_location for old CUDA versions on MSVC"
caugonnet Oct 22, 2024
d160e07
workaround some not interesting compiler issue
caugonnet Oct 22, 2024
7b5deb3
fix small C++ issues
caugonnet Oct 22, 2024
ae545f8
Disable MSVC for now
miscco Oct 22, 2024
c33e12b
Fix formatting
miscco Oct 22, 2024
bacff41
No examples for you MSVC
miscco Oct 22, 2024
bacdf44
Work without nvtx if not present
andralex Oct 22, 2024
0f093e7
Update cudax/CMakeLists.txt
alliepiper Oct 22, 2024
b26dcbc
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 23, 2024
34b2c46
Remove overrides
miscco Oct 23, 2024
fc25e83
Avoid pass by value in constant_logical_data
andralex Oct 23, 2024
feb9544
Move setenv from global namespace to cuda::experimental::stf
andralex Oct 23, 2024
37bbaa0
Add assertion to each()
andralex Oct 24, 2024
e7fc537
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 24, 2024
985523e
Avoid useless comparison with 0 warning
miscco Oct 24, 2024
a9f9421
Use std and fix formatting
miscco Oct 24, 2024
3 changes: 3 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.yml
@@ -21,6 +21,9 @@ body:
 - Thrust
 - CUB
 - libcu++
+- CUDA Experimental (cudax)
+- cuda.cooperative (Python)
+- cuda.parallel (Python)
 - General CCCL
 - Infrastructure
 - Not sure
17 changes: 14 additions & 3 deletions CMakePresets.json
@@ -62,18 +62,26 @@
 "THRUST_MULTICONFIG_ENABLE_SYSTEM_TBB": true,
 "cudax_ENABLE_HEADER_TESTING": true,
 "cudax_ENABLE_TESTING": true,
+"cudax_ENABLE_EXAMPLES": true,
+"cudax_ENABLE_CUDASTF_BOUNDSCHECK": false,
+"cudax_ENABLE_CUDASTF_DEBUG": false,
+"cudax_ENABLE_CUDASTF_MATHLIBS": false,
 "cudax_ENABLE_DIALECT_CPP17": true,
-"cudax_ENABLE_DIALECT_CPP20": true
+"cudax_ENABLE_DIALECT_CPP20": true,
+"CCCL_C_Parallel_ENABLE_TESTING": true,
+"CCCL_C_Parallel_ENABLE_HEADER_TESTING": true
 }
 },
 {
 "name": "all-dev-debug",
 "displayName": "all-dev debug",
 "inherits": "all-dev",
 "cacheVariables": {
-"CCCL_ENABLE_BENCHMARKS": false,
 "CMAKE_BUILD_TYPE": "Debug",
-"CMAKE_CUDA_FLAGS": "-G"
+"CMAKE_CUDA_FLAGS": "-G",
+"CCCL_ENABLE_BENCHMARKS": false,
+"cudax_ENABLE_CUDASTF_BOUNDSCHECK": true,
+"cudax_ENABLE_CUDASTF_DEBUG": true
 }
 },
 {
@@ -295,6 +303,9 @@
 "CCCL_ENABLE_CUDAX": true,
 "cudax_ENABLE_HEADER_TESTING": true,
 "cudax_ENABLE_TESTING": true,
+"cudax_ENABLE_EXAMPLES": true,
+"cudax_ENABLE_CUDASTF_BOUNDSCHECK": false,
+"cudax_ENABLE_CUDASTF_MATHLIBS": false,
 "cudax_ENABLE_DIALECT_CPP17": false,
 "cudax_ENABLE_DIALECT_CPP20": false
 }
2 changes: 1 addition & 1 deletion README.md
@@ -177,7 +177,7 @@ conda install -c conda-forge cccl
 CCCL uses [CMake](https://cmake.org/) for all build and installation infrastructure, including tests as well as targets to link against in other CMake projects.
 Therefore, CMake is the recommended way to integrate CCCL into another project.

-For a complete example of how to do this using CMake Package Manager see [our example project](examples/example_project).
+For a complete example of how to do this using CMake Package Manager see [our basic example project](examples/basic).

 Other build systems should work, but only CMake is tested.
 Contributions to simplify integrating CCCL into other build systems are welcome.
2 changes: 1 addition & 1 deletion c/parallel/src/reduce.cu
@@ -394,7 +394,7 @@ extern "C" CCCL_C_API CUresult cccl_device_reduce_build(
 : std::format(
 "extern \"C\" __device__ {3} {4}(const void *self_ptr);\n"
 "extern \"C\" __device__ void {5}(void *self_ptr, {0} offset);\n"
-"struct __align__({2}) input_iterator_state_t {{\n;"
+"struct __align__({2}) input_iterator_state_t {{\n"
 " using iterator_category = cuda::std::random_access_iterator_tag;\n"
 " using value_type = {3};\n"
 " using difference_type = {0};\n"
2 changes: 1 addition & 1 deletion c/parallel/test/CMakeLists.txt
@@ -20,7 +20,7 @@ function(cccl_c_parallel_add_test target_name_var source)

 target_compile_definitions(${target_name} PRIVATE
 TEST_CUB_PATH="-I${CCCL_SOURCE_DIR}/cub"
-TEST_THRUST_PATH="-I${CCCL_SOURCE_DIR}/cub"
+TEST_THRUST_PATH="-I${CCCL_SOURCE_DIR}/thrust"
 TEST_LIBCUDACXX_PATH="-I${CCCL_SOURCE_DIR}/libcudacxx/include"
 TEST_CTK_PATH="-I${CUDAToolkit_INCLUDE_DIRS}"
 )
2 changes: 1 addition & 1 deletion c/parallel/test/c2h.h
@@ -101,7 +101,7 @@ static std::string compile(const std::string& source)
 REQUIRE(NVRTC_SUCCESS == nvrtcGetLTOIR(prog, ltoir.get()));
 REQUIRE(NVRTC_SUCCESS == nvrtcDestroyProgram(&prog));

-return std::string(ltoir.release(), ltoir_size);
+return std::string(ltoir.get(), ltoir_size);
 }

 template <class T>
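The switch from `release()` to `get()` in `c2h.h` can be illustrated with a minimal sketch. It assumes `ltoir` is a `std::unique_ptr<char[]>` (its declaration is not visible in this hunk); `copy_out` and its parameters are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper, not part of c2h.h: fills a heap buffer and returns its contents as a string.
inline std::string copy_out(std::size_t size, char fill)
{
  // Assumption: the buffer is owned by a unique_ptr, as `ltoir` appears to be in the hunk.
  std::unique_ptr<char[]> buffer{new char[size]};
  for (std::size_t i = 0; i < size; ++i)
  {
    buffer[i] = fill;
  }

  // std::string copies the bytes, so it only needs to observe the pointer.
  // get() leaves ownership with `buffer`, which frees the allocation on return.
  // release() would give up ownership without freeing, so the allocation would leak.
  return std::string(buffer.get(), size);
}
```

Calling `copy_out(3, 'x')` yields a three-character string, and the temporary buffer is freed when the function returns.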
2 changes: 1 addition & 1 deletion ci/build_common.sh
@@ -34,7 +34,7 @@ function usage {
 echo " $ PARALLEL_LEVEL=8 $0"
 echo " $ PARALLEL_LEVEL=8 $0 -cxx g++-9"
 echo " $ $0 -cxx clang++-8"
-echo " $ $0 -configure -arch=80"
+echo " $ $0 -configure -arch 80"
 echo " $ $0 -cxx g++-8 -std 14 -arch 80-real -v -cuda /usr/local/bin/nvcc"
 echo " $ $0 -cmake-options \"-DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS=-Wfatal-errors\""
 exit 1
12 changes: 12 additions & 0 deletions ci/matrix.yaml
@@ -8,6 +8,18 @@ workflows:
 # - {jobs: ['test'], project: 'thrust', std: 17, ctk: 'curr', cxx: ['gcc12', 'llvm16']}
 #
 override:
+- {jobs: ['build'], project: 'cudax', ctk: ['12.0', 'curr'], std: 'all', cxx: ['gcc9', 'gcc10', 'gcc11']}
+- {jobs: ['build'], project: 'cudax', ctk: ['12.0', 'curr'], std: 'all', cxx: ['clang9', 'clang10', 'clang11', 'clang12', 'clang13']}
+- {jobs: ['build'], project: 'cudax', ctk: [ 'curr'], std: 'all', cxx: ['clang14', 'clang15', 'clang16', 'clang17']}
+- {jobs: ['build'], project: 'cudax', ctk: ['12.0', ], std: 20, cxx: ['msvc14.36']}
+- {jobs: ['build'], project: 'cudax', ctk: [ 'curr'], std: 20, cxx: ['msvc2022']}
+- {jobs: ['build'], project: 'cudax', ctk: ['12.0' ], std: 17, cxx: ['gcc12'], sm: "90"}
+- {jobs: ['build'], project: 'cudax', ctk: [ 'curr'], std: 17, cxx: ['gcc13'], sm: "90a"}
+- {jobs: ['build'], project: 'cudax', ctk: [ 'curr'], std: 'all', cxx: ['gcc13', 'clang16'], cpu: 'arm64'}
+- {jobs: ['test'], project: 'cudax', ctk: ['12.0' ], std: 'min', cxx: ['gcc12']}
+- {jobs: ['test'], project: 'cudax', ctk: [ 'curr'], std: 'all', cxx: ['gcc12']}
+- {jobs: ['test'], project: 'cudax', ctk: ['12.0' ], std: 'max', cxx: ['clang14']}
+- {jobs: ['test'], project: 'cudax', ctk: [ 'curr'], std: 'max', cxx: ['clang18']}

 pull_request:
 # Old CTK
7 changes: 7 additions & 0 deletions cmake/CCCLConfigureTarget.cmake
@@ -10,6 +10,13 @@ function(cccl_configure_target target_name)

 get_target_property(type ${target_name} TYPE)

+set_target_properties(${target_name}
+PROPERTIES
+# Disable compiler extensions:
+CXX_EXTENSIONS OFF
+CUDA_EXTENSIONS OFF
+)
+
 if (DEFINED CCT_DIALECT)
 set_target_properties(${target_name}
 PROPERTIES
2 changes: 1 addition & 1 deletion cub/cub/device/dispatch/dispatch_batch_memcpy.cuh
@@ -455,7 +455,7 @@ struct DispatchBatchMemcpy : SelectedPolicy
 // The number of thread blocks (or tiles) required to process all of the given buffers
 BlockOffsetT num_tiles = ::cuda::ceil_div(num_buffers, TILE_SIZE);

-using BlevBufferSrcsOutT = ::cuda::std::_If<IsMemcpy, void*, cub::detail::value_t<InputBufferIt>>;
+using BlevBufferSrcsOutT = ::cuda::std::_If<IsMemcpy, const void*, cub::detail::value_t<InputBufferIt>>;
 using BlevBufferDstOutT = ::cuda::std::_If<IsMemcpy, void*, cub::detail::value_t<OutputBufferIt>>;
 using BlevBufferSrcsOutItT = BlevBufferSrcsOutT*;
 using BlevBufferDstsOutItT = BlevBufferDstOutT*;
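Why `const void*` is needed for the source alias can be sketched in plain C++. This is only an approximation: `std::conditional_t` and `std::iterator_traits` stand in for `::cuda::std::_If` and `cub::detail::value_t`, and the names `src_out_t`, `const_src_it`, `accepts_const_sources`, and `old_alias_accepts_const_sources` are hypothetical.

```cpp
#include <cstdint>
#include <iterator>
#include <type_traits>

// Assumption: standard-library stand-ins for ::cuda::std::_If and cub::detail::value_t.
template <bool IsMemcpy, class InputBufferIt>
using src_out_t =
  std::conditional_t<IsMemcpy, const void*, typename std::iterator_traits<InputBufferIt>::value_type>;

// An iterator over read-only source buffers, similar to the `const uint8_t*` sources in the updated test.
using const_src_it = const std::uint8_t* const*;

// A pointer-to-const converts implicitly to `const void*`.
constexpr bool accepts_const_sources()
{
  return std::is_convertible_v<const std::uint8_t*, src_out_t<true, const_src_it>>;
}

// The same pointer does not convert to plain `void*`, which the old alias required.
constexpr bool old_alias_accepts_const_sources()
{
  return std::is_convertible_v<const std::uint8_t*, void*>;
}

static_assert(accepts_const_sources(), "const sources must convert to const void*");
static_assert(!old_alias_accepts_const_sources(), "const sources do not convert to void*");
```

The two `static_assert`s compile only because the conversion to `const void*` is allowed while the conversion to `void*` would drop the `const` qualifier.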
4 changes: 2 additions & 2 deletions cub/cub/device/dispatch/dispatch_merge.cuh
@@ -25,8 +25,8 @@ namespace detail
 {
 namespace merge
 {
-_LIBCUDACXX_INLINE_VAR constexpr int fallback_BLOCK_THREADS = 64;
-_LIBCUDACXX_INLINE_VAR constexpr int fallback_ITEMS_PER_THREAD = 1;
+_CCCL_INLINE_VAR constexpr int fallback_BLOCK_THREADS = 64;
+_CCCL_INLINE_VAR constexpr int fallback_ITEMS_PER_THREAD = 1;

 template <typename DefaultPolicy, class... Args>
 class choose_merge_agent
9 changes: 1 addition & 8 deletions cub/cub/device/dispatch/dispatch_select_if.cuh
@@ -237,13 +237,6 @@ struct agent_select_if_wrapper_t
 /******************************************************************************
 * Kernel entry points
 *****************************************************************************/
-// TODO (elstehle) gird-private constants were introduced in CTK 11.7. The macro is temporarily placed here, do we want
-// to make this a CCCL macro?
-#if defined(_CCCL_CUDACC_BELOW_11_7) || (CUB_PTX_ARCH < 700)
-# define _CUB_GRID_CONSTANT
-#else
-# define _CUB_GRID_CONSTANT __grid_constant__
-#endif

 /**
 * Select kernel entry point (multi-block)
@@ -356,7 +349,7 @@ __launch_bounds__(int(
 EqualityOpT equality_op,
 OffsetT num_items,
 int num_tiles,
-_CUB_GRID_CONSTANT const StreamingContextT streaming_context,
+_CCCL_GRID_CONSTANT const StreamingContextT streaming_context,
 cub::detail::vsmem_t vsmem)
 {
 using VsmemHelperT = cub::detail::vsmem_helper_default_fallback_policy_t<
23 changes: 11 additions & 12 deletions cub/cub/iterator/arg_index_input_iterator.cuh
@@ -64,8 +64,8 @@ CUB_NAMESPACE_BEGIN
 * indices (forming \p KeyValuePair tuples).
 *
 * @par Overview
-* - ArgIndexInputIteratorTwraps a random access input iterator @p itr of type @p InputIteratorT.
-* Dereferencing an ArgIndexInputIteratorTat offset @p i produces a @p KeyValuePair value whose
+* - ArgIndexInputIterator wraps a random access input iterator @p itr of type @p InputIteratorT.
+* Dereferencing an ArgIndexInputIterator at offset @p i produces a @p KeyValuePair value whose
 * @p key field is @p i and whose @p value field is <tt>itr[i]</tt>.
 * - Can be used with any data type.
 * - Can be constructed, manipulated, and exchanged within and between host and device
@@ -74,7 +74,7 @@ CUB_NAMESPACE_BEGIN
 * - Compatible with Thrust API v1.7 or newer.
 *
 * @par Snippet
-* The code snippet below illustrates the use of @p ArgIndexInputIteratorTto
+* The code snippet below illustrates the use of @p ArgIndexInputIterator to
 * dereference an array of doubles
 * @par
 * @code
@@ -87,17 +87,16 @@ CUB_NAMESPACE_BEGIN
 * cub::ArgIndexInputIterator<double*> itr(d_in);
 *
 * // Within device code:
-* using Tuple = typename cub::ArgIndexInputIterator<double*>::value_type;
-* Tuple item_offset_pair.key = *itr;
-* printf("%f @ %d\n",
-* item_offset_pair.value,
-* item_offset_pair.key); // 8.0 @ 0
+* cub::ArgIndexInputIterator<double*>::value_type tup = *itr;
+* printf("%f @ %ld\n",
+* tup.value,
+* tup.key); // 8.0 @ 0
 *
 * itr = itr + 6;
-* item_offset_pair.key = *itr;
-* printf("%f @ %d\n",
-* item_offset_pair.value,
-* item_offset_pair.key); // 9.0 @ 6
+* tup = *itr;
+* printf("%f @ %ld\n",
+* tup.value,
+* tup.key); // 9.0 @ 6
 *
 * @endcode
 *
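The behaviour described in the `ArgIndexInputIterator` doc comment (dereferencing at offset `i` yields a pair whose `key` is `i` and whose `value` is `itr[i]`) can be mimicked on the host with a small sketch. The types `key_value_pair` and `arg_index_iterator` below are hypothetical stand-ins written for illustration; they are not the CUB implementation.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical host-only stand-in for cub::KeyValuePair.
template <class Value>
struct key_value_pair
{
  std::ptrdiff_t key; // offset of the element
  Value value;        // element stored at that offset
};

// Hypothetical host-only stand-in for cub::ArgIndexInputIterator.
template <class InputIt>
struct arg_index_iterator
{
  InputIt itr;               // wrapped random-access iterator
  std::ptrdiff_t offset = 0; // current position relative to itr

  // Dereferencing pairs the current offset with the element found there.
  auto operator*() const
  {
    using value_type = std::remove_cv_t<std::remove_reference_t<decltype(itr[offset])>>;
    return key_value_pair<value_type>{offset, itr[offset]};
  }

  // Advancing changes only the offset; the wrapped iterator is left untouched.
  arg_index_iterator operator+(std::ptrdiff_t n) const
  {
    return arg_index_iterator{itr, offset + n};
  }
};
```

With an array whose first element is 8.0 and whose seventh element is 9.0, `*it` gives key 0 and value 8.0, and `*(it + 6)` gives key 6 and value 9.0, matching the comments in the corrected snippet. Because the key is a signed offset wider than `int` on common 64-bit platforms, the corrected snippet's `%ld` format is a better fit than `%d`.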
4 changes: 2 additions & 2 deletions cub/cub/iterator/constant_input_iterator.cuh
@@ -61,15 +61,15 @@ CUB_NAMESPACE_BEGIN
 * @brief A random-access input generator for dereferencing a sequence of homogeneous values
 *
 * @par Overview
-* - Read references to a ConstantInputIteratorTiterator always return the supplied constant
+* - Read references to a ConstantInputIterator always return the supplied constant
 * of type @p ValueType.
 * - Can be used with any data type.
 * - Can be constructed, manipulated, dereferenced, and exchanged within and between host and device
 * functions.
 * - Compatible with Thrust API v1.7 or newer.
 *
 * @par Snippet
-* The code snippet below illustrates the use of @p ConstantInputIteratorTto
+* The code snippet below illustrates the use of @p ConstantInputIterator to
 * dereference a sequence of homogeneous doubles.
 * @par
 * @code
4 changes: 2 additions & 2 deletions cub/cub/iterator/counting_input_iterator.cuh
@@ -67,14 +67,14 @@ CUB_NAMESPACE_BEGIN
 * @brief A random-access input generator for dereferencing a sequence of incrementing integer values.
 *
 * @par Overview
-* - After initializing a CountingInputIteratorTto a certain integer @p base, read references
+* - After initializing a CountingInputIterator to a certain integer @p base, read references
 * at @p offset will return the value @p base + @p offset.
 * - Can be constructed, manipulated, dereferenced, and exchanged within and between host and device
 * functions.
 * - Compatible with Thrust API v1.7 or newer.
 *
 * @par Snippet
-* The code snippet below illustrates the use of @p CountingInputIteratorTto
+* The code snippet below illustrates the use of @p CountingInputIterator to
 * dereference a sequence of incrementing integers.
 * @par
 * @code
4 changes: 2 additions & 2 deletions cub/cub/iterator/transform_input_iterator.cuh
@@ -61,7 +61,7 @@ CUB_NAMESPACE_BEGIN
 * @brief A random-access input wrapper for transforming dereferenced values.
 *
 * @par Overview
-* - TransformInputIteratorTwraps a unary conversion functor of type
+* - TransformInputIterator wraps a unary conversion functor of type
 * @p ConversionOp and a random-access input iterator of type <tt>InputIteratorT</tt>,
 * using the former to produce references of type @p ValueType from the latter.
 * - Can be used with any data type.
@@ -71,7 +71,7 @@ CUB_NAMESPACE_BEGIN
 * - Compatible with Thrust API v1.7 or newer.
 *
 * @par Snippet
-* The code snippet below illustrates the use of @p TransformInputIteratorTto
+* The code snippet below illustrates the use of @p TransformInputIterator to
 * dereference an array of integers, tripling the values and converting them to doubles.
 * @par
 * @code
2 changes: 1 addition & 1 deletion cub/cub/util_type.cuh
@@ -458,7 +458,7 @@ struct CubVector
 };

 /// The maximum number of elements in CUDA vector types
-_LIBCUDACXX_INLINE_VAR constexpr int MAX_VEC_ELEMENTS = 4;
+_CCCL_INLINE_VAR constexpr int MAX_VEC_ELEMENTS = 4;

 /**
 * Generic vector-1 type
2 changes: 1 addition & 1 deletion cub/test/catch2_test_launch_helper.h
@@ -91,7 +91,7 @@

 #define DECLARE_LAUNCH_WRAPPER(API, WRAPPED_API_NAME) \
 DECLARE_INVOCABLE(API, WRAPPED_API_NAME, , ); \
-_LIBCUDACXX_INLINE_VAR constexpr struct WRAPPED_API_NAME##_t \
+_CCCL_INLINE_VAR constexpr struct WRAPPED_API_NAME##_t \
 { \
 template <class... As> \
 void operator()(As... args) const \
7 changes: 4 additions & 3 deletions cub/test/test_device_batch_memcpy.cu
@@ -216,7 +216,8 @@ void RunTest(BufferOffsetT num_buffers,
 {
 try
 {
-using SrcPtrT = uint8_t*;
+using SrcPtrT = const uint8_t*;
+using DstPtrT = uint8_t*;

 // Buffer segment data (their offsets and sizes)
 c2h::host_vector<BufferSizeT> h_buffer_sizes(num_buffers);
@@ -286,8 +287,8 @@ void RunTest(BufferOffsetT num_buffers,
 thrust::raw_pointer_cast(d_buffer_src_offsets.data()), src_transform_op);

 // Prepare d_buffer_dsts
-OffsetToPtrOp<SrcPtrT> dst_transform_op{static_cast<SrcPtrT>(thrust::raw_pointer_cast(d_out.data()))};
-cub::TransformInputIterator<SrcPtrT, OffsetToPtrOp<SrcPtrT>, ByteOffsetT*> d_buffer_dsts(
+OffsetToPtrOp<DstPtrT> dst_transform_op{static_cast<DstPtrT>(thrust::raw_pointer_cast(d_out.data()))};
+cub::TransformInputIterator<DstPtrT, OffsetToPtrOp<DstPtrT>, ByteOffsetT*> d_buffer_dsts(
 thrust::raw_pointer_cast(d_buffer_dst_offsets.data()), dst_transform_op);

 // Get temporary storage requirements
16 changes: 14 additions & 2 deletions cudax/CMakeLists.txt
@@ -16,9 +16,19 @@ endif()

 option(cudax_ENABLE_HEADER_TESTING "Test that CUDA Experimental's public headers compile." ON)
 option(cudax_ENABLE_TESTING "Build CUDA Experimental's tests." ON)
+option(cudax_ENABLE_EXAMPLES "Build CUDA Experimental's tests." ON)
+option(cudax_ENABLE_CUDASTF_BOUNDSCHECK "Enable bounds checks for STF targets. Requires debug build." OFF)
+option(cudax_ENABLE_CUDASTF_DEBUG "Enable additional debugging for STF targets. Requires debug build." OFF)
+option(cudax_ENABLE_CUDASTF_MATHLIBS "Enable STF tests/examples that use cublas/cusolver." OFF)
+
+if ((cudax_ENABLE_CUDASTF_BOUNDSCHECK OR cudax_ENABLE_CUDASTF_DEBUG) AND
+NOT CMAKE_BUILD_TYPE MATCHES "Debug" AND NOT CMAKE_BUILD_TYPE MATCHES "RelWithDebInfo")
+message(FATAL_ERROR "cudax_ENABLE_CUDASTF_BOUNDSCHECK and cudax_ENABLE_CUDASTF_DEBUG require a Debug build.")
+endif()

 include(cmake/cudaxBuildCompilerTargets.cmake)
 include(cmake/cudaxBuildTargetList.cmake)
+include(cmake/cudaxSTFConfigureTarget.cmake)

 cudax_build_compiler_targets()
 cudax_build_target_list()
@@ -28,7 +38,9 @@ if (cudax_ENABLE_HEADER_TESTING)
 endif()

 if (cudax_ENABLE_TESTING)
-include(CTest)
-enable_testing() # Must be in root directory
 add_subdirectory(test)
 endif()
+
+if (cudax_ENABLE_EXAMPLES)
+add_subdirectory(examples)
+endif()
12 changes: 12 additions & 0 deletions cudax/cmake/cudaxBuildCompilerTargets.cmake
@@ -23,6 +23,18 @@ function(cudax_build_compiler_targets)
 # cudax requires dim3 to be usable from a constexpr context, and the CUDART headers require
 # __cplusplus to be defined for this to work:
 append_option_if_available("/Zc:__cplusplus" cxx_compile_options)
+
+# cudax requires __VA_OPT__ for its unit tests
+append_option_if_available("/Zc:preprocessor" cxx_compile_options)
+
+# stf used getenv which is potentially unsafe but not in our context
+list(APPEND cxx_compile_definitions "_CRT_SECURE_NO_WARNINGS")
 endif()
+
+if("Clang" STREQUAL "${CMAKE_CXX_COMPILER_ID}")
+# stf heavily uses host device lambdas which break on clang due to a warning about the implicitly
+# deleted copy constructor
+append_option_if_available("-Wno-deprecated-copy" cxx_compile_options)
+endif()

 cccl_build_compiler_interface(cudax.compiler_interface