Unify kernel dispatch paths for device reduce between CUB and c.parallel. #2591

Merged 8 commits on Oct 23, 2024
2 changes: 2 additions & 0 deletions c/parallel/CMakeLists.txt
@@ -37,6 +37,8 @@ target_link_libraries(cccl.c.parallel PRIVATE
CUDA::nvJitLink
CUDA::cuda_driver
cccl.compiler_interface_cpp20
CUB::CUB
Thrust::Thrust
)
target_compile_definitions(cccl.c.parallel PUBLIC CCCL_C_EXPERIMENTAL=1)
target_compile_definitions(cccl.c.parallel PRIVATE NVRTC_GET_TYPE_NAME=1)
1 change: 1 addition & 0 deletions c/parallel/include/cccl/c/reduce.h
@@ -24,6 +24,7 @@ struct cccl_device_reduce_build_result_t
void* cubin;
size_t cubin_size;
CUlibrary library;
unsigned long long accumulator_size;
CUkernel single_tile_kernel;
CUkernel single_tile_second_kernel;
CUkernel reduction_kernel;
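
The hunk above shows an `accumulator_size` field in the build result. One plausible reason to carry it there, and this is an assumption rather than something the diff states, is that the host-side dispatch code has no compile-time accumulator type and needs the size in bytes to lay out temporary storage for per-block partial results. A minimal sketch of that arithmetic follows; `build_result_sketch_t` and `partials_bytes` are hypothetical names, and the CUDA handle fields are left out so the snippet compiles without the CUDA toolkit:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cccl_device_reduce_build_result_t.
   Only accumulator_size is mirrored; cubin, library and the kernel
   handles are omitted on purpose. */
typedef struct
{
  unsigned long long accumulator_size; /* bytes per accumulator value */
} build_result_sketch_t;

/* Assumption: a multi-block reduction stores one partial accumulator
   per block before a final pass combines them. Returns the bytes
   needed for those partials. */
static size_t partials_bytes(const build_result_sketch_t* build, size_t num_blocks)
{
  assert(build != NULL);
  return (size_t) build->accumulator_size * num_blocks;
}
```

For example, with an 8-byte accumulator and 4 blocks, `partials_bytes` returns 32. The real dispatch path may add alignment padding or extra bookkeeping that this sketch does not model.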