Replace cub::ArrayWrapper by cuda::std::array and deprecate it #1764

Merged
merged 2 commits into from
May 28, 2024

Contributor

@bernhardmgruber bernhardmgruber commented May 21, 2024

The type cub::ArrayWrapper is only used by CUB's histogram algorithms and can be replaced by cuda::std::array, which offers a superset of the functionality.

  • Check generated SASS is the same

The generated SASS is identical before and after this PR, except for the mangling of function names.
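
As a rough illustration of the replacement described above, the sketch below contrasts a hand-written wrapper with a standard array type. It is host-only C++ in which `std::array` stands in for `cuda::std::array` (assumed here to offer the same interface), and `LegacyArrayWrapper`, `sum_levels_legacy`, and `sum_levels` are hypothetical names for illustration, not CUB's actual code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a wrapper like cub::ArrayWrapper: a struct whose
// only job is to let a fixed-size C array be passed by value.
template <typename T, int COUNT>
struct LegacyArrayWrapper
{
  T array[COUNT];
};

// Before: callers reach through the public `array` member.
template <int COUNT>
int sum_levels_legacy(LegacyArrayWrapper<int, COUNT> levels)
{
  int total = 0;
  for (int i = 0; i < COUNT; ++i)
  {
    total += levels.array[i];
  }
  return total;
}

// After: std::array (standing in for cuda::std::array) is passed by value in
// the same way and additionally offers operator[], size(), and iterators.
template <std::size_t COUNT>
int sum_levels(std::array<int, COUNT> levels)
{
  int total = 0;
  for (int level : levels)
  {
    total += level;
  }
  return total;
}
```

Calling `sum_levels_legacy(LegacyArrayWrapper<int, 3>{{1, 2, 3}})` and `sum_levels(std::array<int, 3>{{1, 2, 3}})` should produce the same sum; only the element access changes at the call sites.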

@bernhardmgruber bernhardmgruber force-pushed the arraywrapper branch 2 times, most recently from 8feecc0 to 85a7610 May 21, 2024 07:16
Contributor

🟨 CI Results [ Failed: 8 | Passed: 190 | Total: 198 ]
  • 🟩 Project thrust [ Failed: 0 | Passed: 99 | Total: 99 ]

    🟩 cpu
      🟩 amd64 (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🟩 arm64 (0% Fail)              Failed:  0  -- Passed:  8  -- Total:  8 
    🟩 ctk
      🟩 11.1 (0% Fail)               Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 11.8 (0% Fail)               Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 12.4 (0% Fail)               Failed:  0  -- Passed: 81  -- Total: 81 
    🟩 cudacxx_full
      🟩 clang-cuda16 (0% Fail)       Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc11.1 (0% Fail)           Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 nvcc11.8 (0% Fail)           Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 nvcc12.4 (0% Fail)           Failed:  0  -- Passed: 79  -- Total: 79 
    🟩 cudacxx_name
      🟩 clang-cuda (0% Fail)         Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc (0% Fail)               Failed:  0  -- Passed: 97  -- Total: 97 
    🟩 cxx_full
      🟩 clang9 (0% Fail)             Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 clang10 (0% Fail)            Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 clang11 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang12 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang13 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang14 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang15 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang16 (0% Fail)            Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 gcc6 (0% Fail)               Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 gcc7 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc8 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc9 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc10 (0% Fail)              Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 gcc11 (0% Fail)              Failed:  0  -- Passed:  7  -- Total:  7 
      🟩 gcc12 (0% Fail)              Failed:  0  -- Passed: 16  -- Total: 16 
      🟩 Intel2023.2.0 (0% Fail)      Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC14.16 (0% Fail)          Failed:  0  -- Passed:  1  -- Total:  1 
      🟩 MSVC14.29 (0% Fail)          Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 MSVC14.39 (0% Fail)          Failed:  0  -- Passed:  3  -- Total:  3 
    🟩 cxx_name
      🟩 clang (0% Fail)              Failed:  0  -- Passed: 43  -- Total: 43 
      🟩 gcc (0% Fail)                Failed:  0  -- Passed: 47  -- Total: 47 
      🟩 Intel (0% Fail)              Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 gpu
      🟩 v100 (0% Fail)               Failed:  0  -- Passed: 99  -- Total: 99 
    🟩 jobs
      🟩 build (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🟩 test (0% Fail)               Failed:  0  -- Passed:  8  -- Total:  8 
    🟩 os
      🟩 ubuntu18.04 (0% Fail)        Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 ubuntu20.04 (0% Fail)        Failed:  0  -- Passed: 35  -- Total: 35 
      🟩 ubuntu22.04 (0% Fail)        Failed:  0  -- Passed: 44  -- Total: 44 
      🟩 windows2022 (0% Fail)        Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 sm
      🟩 60;70;80;90 (0% Fail)        Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 90a (0% Fail)                Failed:  0  -- Passed:  4  -- Total:  4 
    🟩 std
      🟩 11 (0% Fail)                 Failed:  0  -- Passed: 26  -- Total: 26 
      🟩 14 (0% Fail)                 Failed:  0  -- Passed: 29  -- Total: 29 
      🟩 17 (0% Fail)                 Failed:  0  -- Passed: 28  -- Total: 28 
      🟩 20 (0% Fail)                 Failed:  0  -- Passed: 16  -- Total: 16 
    
  • 🟨 Project cub [ Failed: 8 | Passed: 91 | Total: 99 ]

    🔍 cpu: amd64 🔍
      🔍 amd64 (8% Fail)              Failed:  8  -- Passed: 83  -- Total: 91 
      🟩 arm64 (0% Fail)              Failed:  0  -- Passed:  8  -- Total:  8 
    🔍 ctk: 12.4 🔍
      🟩 11.1 (0% Fail)               Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 11.8 (0% Fail)               Failed:  0  -- Passed:  3  -- Total:  3 
      🔍 12.4 (9% Fail)               Failed:  8  -- Passed: 73  -- Total: 81 
    🔍 cudacxx_full: nvcc12.4 🔍
      🟩 clang-cuda16 (0% Fail)       Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc11.1 (0% Fail)           Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 nvcc11.8 (0% Fail)           Failed:  0  -- Passed:  3  -- Total:  3 
      🔍 nvcc12.4 (10% Fail)          Failed:  8  -- Passed: 71  -- Total: 79 
    🔍 cudacxx_name: nvcc 🔍
      🟩 clang-cuda (0% Fail)         Failed:  0  -- Passed:  2  -- Total:  2 
      🔍 nvcc (8% Fail)               Failed:  8  -- Passed: 89  -- Total: 97 
    🚨 jobs: test 🚨
      🟩 build (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🔥 test (100% Fail)             Failed:  8  -- Passed:  0  -- Total:  8 
    🔍 os: ubuntu22.04 🔍
      🟩 ubuntu18.04 (0% Fail)        Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 ubuntu20.04 (0% Fail)        Failed:  0  -- Passed: 35  -- Total: 35 
      🔍 ubuntu22.04 (18% Fail)       Failed:  8  -- Passed: 36  -- Total: 44 
      🟩 windows2022 (0% Fail)        Failed:  0  -- Passed:  6  -- Total:  6 
    🟨 cxx_full
      🟩 clang9 (0% Fail)             Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 clang10 (0% Fail)            Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 clang11 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang12 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang13 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang14 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang15 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟨 clang16 (28% Fail)           Failed:  4  -- Passed: 10  -- Total: 14 
      🟩 gcc6 (0% Fail)               Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 gcc7 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc8 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc9 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc10 (0% Fail)              Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 gcc11 (0% Fail)              Failed:  0  -- Passed:  7  -- Total:  7 
      🟨 gcc12 (25% Fail)             Failed:  4  -- Passed: 12  -- Total: 16 
      🟩 Intel2023.2.0 (0% Fail)      Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC14.16 (0% Fail)          Failed:  0  -- Passed:  1  -- Total:  1 
      🟩 MSVC14.29 (0% Fail)          Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 MSVC14.39 (0% Fail)          Failed:  0  -- Passed:  3  -- Total:  3 
    🟨 cxx_name
      🟨 clang (9% Fail)              Failed:  4  -- Passed: 39  -- Total: 43 
      🟨 gcc (8% Fail)                Failed:  4  -- Passed: 43  -- Total: 47 
      🟩 Intel (0% Fail)              Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
    🟨 std
      🟨 11 (7% Fail)                 Failed:  2  -- Passed: 24  -- Total: 26 
      🟨 14 (6% Fail)                 Failed:  2  -- Passed: 27  -- Total: 29 
      🟨 17 (7% Fail)                 Failed:  2  -- Passed: 26  -- Total: 28 
      🟨 20 (12% Fail)                Failed:  2  -- Passed: 14  -- Total: 16 
    🟨 gpu
      🟨 v100 (8% Fail)               Failed:  8  -- Passed: 91  -- Total: 99 
    🟩 sm
      🟩 60;70;80;90 (0% Fail)        Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 90a (0% Fail)                Failed:  0  -- Passed:  4  -- Total:  4 
    

🏃‍ Runner counts (total jobs: 198)

# Runner
154 linux-amd64-cpu16
16 linux-arm64-cpu16
16 linux-amd64-gpu-v100-latest-1
12 windows-amd64-cpu16

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental

Collaborator

@gevtushenko gevtushenko left a comment

It's exciting to see CUB types being replaced by standard utilities!
This change seems to lead to many codegen differences (many kernels are just BPT.TRAP 0x1 now). Before getting this PR out of draft, could you make sure that the resulting SASS matches the SASS generated without your modifications? That'd save us a lot of time on benchmarking.

@@ -885,7 +885,7 @@ struct KeyValuePair<K, V, false, true>
  * \brief A wrapper for passing simple static arrays as kernel parameters
  */
 template <typename T, int COUNT>
-struct ArrayWrapper
+struct CUB_DEPRECATED ArrayWrapper
Collaborator

important: we tend to document deprecations in a way that orients users toward what to use instead, especially given that this wrapper is used by clients:

   * @deprecated [Since 2.5.0] The `cub::ArrayWrapper` is deprecated. 
   *   Use `cuda::std::array` instead.

@miscco if I remember correctly, there were some issues with @deprecated on the Sphinx end. Is that still the case?
@jrhemstad if I remember correctly, we tried to postpone all deprecations until a major release to avoid source compatibility issues on minor CTK updates. Is that still the recommended way, or can we relax this requirement given the small ArrayWrapper user base?

Collaborator

we tried to postpone all deprecations until a major release to avoid source compatibility issues on minor CTK updates. Is that still the recommended way, or can we relax this requirement given the small ArrayWrapper user base?

Adding new deprecation warnings in a minor release is fair game as far as I'm concerned. We should just make sure the emitted warning tells people how to turn it off.
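
One way to meet both requests in this thread (a message that names the replacement, and a documented way to silence the warning) is sketched below in plain C++14. The macro names `MYLIB_DEPRECATED_BECAUSE` and `MYLIB_IGNORE_DEPRECATED_API`, the type `LegacyArrayWrapper`, and the function `first_element` are hypothetical and chosen for illustration; they are not claimed to match CUB's actual macros or code.

```cpp
#include <cassert>

// Opt-out switch (hypothetical name). It is defined here so that this sketch
// compiles without warnings; remove the definition to see the diagnostic.
#define MYLIB_IGNORE_DEPRECATED_API

#ifdef MYLIB_IGNORE_DEPRECATED_API
#  define MYLIB_DEPRECATED_BECAUSE(msg)
#else
#  define MYLIB_DEPRECATED_BECAUSE(msg) [[deprecated(msg)]]
#endif

// The message names the replacement and the macro that disables the warning.
template <typename T, int COUNT>
struct MYLIB_DEPRECATED_BECAUSE(
  "LegacyArrayWrapper is deprecated; use cuda::std::array instead. "
  "Define MYLIB_IGNORE_DEPRECATED_API to silence this warning.") LegacyArrayWrapper
{
  T array[COUNT];
};

// Existing code that still uses the wrapper keeps compiling during migration.
inline int first_element(LegacyArrayWrapper<int, 2> wrapper)
{
  return wrapper.array[0];
}
```

With the opt-out macro removed, a conforming compiler is expected to emit the message on every use of `LegacyArrayWrapper`, so users see both the replacement and the escape hatch in the same diagnostic.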

Collaborator

@gevtushenko yes, currently we cannot build any \deprecated or @deprecated Doxygen comments.

Contributor Author

@miscco what does this mean for this PR now? Can I add a @deprecated bla bla comment now, or not?

Collaborator

It will break our docs soon, so I believe you should write a plain "deprecated bla bla" comment, so that it is there but is not picked up by Doxygen.

That way, when we fix this, it will be easy to bring it in.

cub/cub/agent/agent_histogram.cuh Outdated
Contributor

🟨 CI Results: Pass: 95%/198 | Total Time: 4d 04h | Avg Time: 30m 30s | Hits: 18%/112452
  • 🟩 thrust: Pass: 100%/99 | Total Time: 1d 18h | Avg Time: 25m 44s | Hits: 22%/50817

    🟩 cpu
      🟩 amd64              Pass: 100%/91  | Total Time:  1d 15h | Avg Time: 25m 46s | Hits:  22%/46709 
      🟩 arm64              Pass: 100%/8   | Total Time:  3h 21m | Avg Time: 25m 13s | Hits:  15%/4108  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total Time:  6h 01m | Avg Time: 24m 06s | Hits:  15%/7700  
      🟩 11.8               Pass: 100%/3   | Total Time:  1h 42m | Avg Time: 34m 00s | Hits:  15%/1542  
      🟩 12.4               Pass: 100%/81  | Total Time:  1d 10h | Avg Time: 25m 43s | Hits:  23%/41575 
    🟩 cudacxx_full
      🟩 clang-cuda16       Pass: 100%/2   | Total Time: 46m 20s | Avg Time: 23m 10s | Hits:  13%/1026  
      🟩 nvcc11.1           Pass: 100%/15  | Total Time:  6h 01m | Avg Time: 24m 06s | Hits:  15%/7700  
      🟩 nvcc11.8           Pass: 100%/3   | Total Time:  1h 42m | Avg Time: 34m 00s | Hits:  15%/1542  
      🟩 nvcc12.4           Pass: 100%/79  | Total Time:  1d 09h | Avg Time: 25m 47s | Hits:  23%/40549 
    🟩 cudacxx_name
      🟩 clang-cuda         Pass: 100%/2   | Total Time: 46m 20s | Avg Time: 23m 10s | Hits:  13%/1026  
      🟩 nvcc               Pass: 100%/97  | Total Time:  1d 17h | Avg Time: 25m 47s | Hits:  22%/49791 
    🟩 cxx_full
      🟩 clang9             Pass: 100%/6   | Total Time:  2h 24m | Avg Time: 24m 04s | Hits:  15%/3078  
      🟩 clang10            Pass: 100%/3   | Total Time:  1h 19m | Avg Time: 26m 29s | Hits:  15%/1539  
      🟩 clang11            Pass: 100%/4   | Total Time:  1h 40m | Avg Time: 25m 12s | Hits:  15%/2052  
      🟩 clang12            Pass: 100%/4   | Total Time:  1h 41m | Avg Time: 25m 27s | Hits:  15%/2052  
      🟩 clang13            Pass: 100%/4   | Total Time:  1h 37m | Avg Time: 24m 29s | Hits:  15%/2052  
      🟩 clang14            Pass: 100%/4   | Total Time:  1h 43m | Avg Time: 25m 53s | Hits:  15%/2052  
      🟩 clang15            Pass: 100%/4   | Total Time:  1h 41m | Avg Time: 25m 20s | Hits:  15%/2052  
      🟩 clang16            Pass: 100%/14  | Total Time:  4h 52m | Avg Time: 20m 51s | Hits:  39%/7182  
      🟩 gcc6               Pass: 100%/2   | Total Time: 43m 49s | Avg Time: 21m 54s | Hits:  16%/1026  
      🟩 gcc7               Pass: 100%/6   | Total Time:  2h 25m | Avg Time: 24m 18s | Hits:  15%/3084  
      🟩 gcc8               Pass: 100%/6   | Total Time:  2h 30m | Avg Time: 25m 02s | Hits:  15%/3084  
      🟩 gcc9               Pass: 100%/6   | Total Time:  2h 29m | Avg Time: 24m 56s | Hits:  15%/3084  
      🟩 gcc10              Pass: 100%/4   | Total Time:  1h 47m | Avg Time: 26m 53s | Hits:  15%/2056  
      🟩 gcc11              Pass: 100%/7   | Total Time:  3h 28m | Avg Time: 29m 49s | Hits:  15%/3598  
      🟩 gcc12              Pass: 100%/16  | Total Time:  5h 22m | Avg Time: 20m 09s | Hits:  36%/8224  
      🟩 Intel2023.2.0      Pass: 100%/3   | Total Time:  1h 38m | Avg Time: 32m 44s | Hits:  16%/1548  
      🟩 MSVC14.16          Pass: 100%/1   | Total Time: 45m 05s | Avg Time: 45m 05s | Hits:  11%/509   
      🟩 MSVC14.29          Pass: 100%/2   | Total Time:  1h 38m | Avg Time: 49m 11s | Hits:  11%/1018  
      🟩 MSVC14.39          Pass: 100%/3   | Total Time:  2h 36m | Avg Time: 52m 05s | Hits:  11%/1527  
    🟩 cxx_name
      🟩 clang              Pass: 100%/43  | Total Time: 17h 01m | Avg Time: 23m 45s | Hits:  23%/22059 
      🟩 gcc                Pass: 100%/47  | Total Time: 18h 48m | Avg Time: 24m 00s | Hits:  22%/24156 
      🟩 Intel              Pass: 100%/3   | Total Time:  1h 38m | Avg Time: 32m 44s | Hits:  16%/1548  
      🟩 MSVC               Pass: 100%/6   | Total Time:  4h 59m | Avg Time: 49m 57s | Hits:  11%/3054  
    🟩 gpu
      🟩 v100               Pass: 100%/99  | Total Time:  1d 18h | Avg Time: 25m 44s | Hits:  22%/50817 
    🟩 jobs
      🟩 build              Pass: 100%/91  | Total Time:  1d 16h | Avg Time: 26m 57s | Hits:  15%/46709 
      🟩 test               Pass: 100%/8   | Total Time:  1h 35m | Avg Time: 11m 52s | Hits:  99%/4108  
    🟩 os
      🟩 ubuntu18.04        Pass: 100%/14  | Total Time:  5h 16m | Avg Time: 22m 36s | Hits:  15%/7191  
      🟩 ubuntu20.04        Pass: 100%/35  | Total Time: 15h 08m | Avg Time: 25m 57s | Hits:  15%/17968 
      🟩 ubuntu22.04        Pass: 100%/44  | Total Time: 17h 02m | Avg Time: 23m 14s | Hits:  30%/22604 
      🟩 windows2022        Pass: 100%/6   | Total Time:  4h 59m | Avg Time: 49m 57s | Hits:  11%/3054  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total Time:  1h 42m | Avg Time: 34m 00s | Hits:  15%/1542  
      🟩 90a                Pass: 100%/4   | Total Time:  1h 01m | Avg Time: 15m 16s | Hits:  15%/2056  
    🟩 std
      🟩 11                 Pass: 100%/26  | Total Time:  9h 20m | Avg Time: 21m 32s | Hits:  26%/13354 
      🟩 14                 Pass: 100%/29  | Total Time: 13h 02m | Avg Time: 26m 58s | Hits:  19%/14881 
      🟩 17                 Pass: 100%/28  | Total Time: 12h 57m | Avg Time: 27m 45s | Hits:  19%/14372 
      🟩 20                 Pass: 100%/16  | Total Time:  7h 08m | Avg Time: 26m 45s | Hits:  24%/8210  
    
  • 🟨 cub: Pass: 91%/99 | Total Time: 2d 10h | Avg Time: 35m 17s | Hits: 15%/61635

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  91%/91  | Total Time:  2d 05h | Avg Time: 35m 33s | Hits:  15%/56003 
      🟩 arm64              Pass: 100%/8   | Total Time:  4h 17m | Avg Time: 32m 12s | Hits:  17%/5632  
    🔍 ctk: 12.4 🔍
      🟩 11.1               Pass: 100%/15  | Total Time:  6h 40m | Avg Time: 26m 41s | Hits:  11%/9350  
      🟩 11.8               Pass: 100%/3   | Total Time:  2h 10m | Avg Time: 43m 30s | Hits:  16%/2112  
      🔍 12.4               Pass:  90%/81  | Total Time:  2d 01h | Avg Time: 36m 34s | Hits:  16%/50173 
    🔍 cudacxx_full: nvcc12.4 🔍
      🟩 clang-cuda16       Pass: 100%/2   | Total Time: 44m 10s | Avg Time: 22m 05s | Hits:  12%/1116  
      🟩 nvcc11.1           Pass: 100%/15  | Total Time:  6h 40m | Avg Time: 26m 41s | Hits:  11%/9350  
      🟩 nvcc11.8           Pass: 100%/3   | Total Time:  2h 10m | Avg Time: 43m 30s | Hits:  16%/2112  
      🔍 nvcc12.4           Pass:  89%/79  | Total Time:  2d 00h | Avg Time: 36m 56s | Hits:  16%/49057 
    🔍 cudacxx_name: nvcc 🔍
      🟩 clang-cuda         Pass: 100%/2   | Total Time: 44m 10s | Avg Time: 22m 05s | Hits:  12%/1116  
      🔍 nvcc               Pass:  91%/97  | Total Time:  2d 09h | Avg Time: 35m 33s | Hits:  15%/60519 
    🚨 jobs: test 🚨
      🟩 build              Pass: 100%/91  | Total Time:  2d 00h | Avg Time: 32m 15s | Hits:  15%/61635 
      🔥 test               Pass:   0%/8   | Total Time:  9h 18m | Avg Time:  1h 09m
    🔍 os: ubuntu22.04 🔍
      🟩 ubuntu18.04        Pass: 100%/14  | Total Time:  5h 50m | Avg Time: 25m 03s | Hits:  11%/8801  
      🟩 ubuntu20.04        Pass: 100%/35  | Total Time: 19h 22m | Avg Time: 33m 13s | Hits:  17%/24710 
      🔍 ubuntu22.04        Pass:  81%/44  | Total Time:  1d 04h | Avg Time: 38m 49s | Hits:  16%/24830 
      🟩 windows2022        Pass: 100%/6   | Total Time:  4h 31m | Avg Time: 45m 14s | Hits:  10%/3294  
    🟨 cxx_full
      🟩 clang9             Pass: 100%/6   | Total Time:  2h 56m | Avg Time: 29m 28s | Hits:  14%/4002  
      🟩 clang10            Pass: 100%/3   | Total Time:  1h 40m | Avg Time: 33m 39s | Hits:  17%/2118  
      🟩 clang11            Pass: 100%/4   | Total Time:  2h 08m | Avg Time: 32m 08s | Hits:  17%/2824  
      🟩 clang12            Pass: 100%/4   | Total Time:  2h 09m | Avg Time: 32m 18s | Hits:  17%/2824  
      🟩 clang13            Pass: 100%/4   | Total Time:  2h 08m | Avg Time: 32m 08s | Hits:  17%/2824  
      🟩 clang14            Pass: 100%/4   | Total Time:  2h 11m | Avg Time: 32m 52s | Hits:  17%/2824  
      🟩 clang15            Pass: 100%/4   | Total Time:  2h 12m | Avg Time: 33m 07s | Hits:  17%/2816  
      🟨 clang16            Pass:  71%/14  | Total Time:  9h 48m | Avg Time: 42m 02s | Hits:  16%/6748  
      🟩 gcc6               Pass: 100%/2   | Total Time: 48m 36s | Avg Time: 24m 18s | Hits:  11%/1256  
      🟩 gcc7               Pass: 100%/6   | Total Time:  2h 51m | Avg Time: 28m 35s | Hits:  14%/4005  
      🟩 gcc8               Pass: 100%/6   | Total Time:  2h 57m | Avg Time: 29m 37s | Hits:  14%/4005  
      🟩 gcc9               Pass: 100%/6   | Total Time:  3h 01m | Avg Time: 30m 13s | Hits:  14%/4005  
      🟩 gcc10              Pass: 100%/4   | Total Time:  2h 18m | Avg Time: 34m 41s | Hits:  16%/2824  
      🟩 gcc11              Pass: 100%/7   | Total Time:  4h 22m | Avg Time: 37m 33s | Hits:  16%/4928  
      🟨 gcc12              Pass:  75%/16  | Total Time: 10h 15m | Avg Time: 38m 26s | Hits:  16%/8448  
      🟩 Intel2023.2.0      Pass: 100%/3   | Total Time:  1h 49m | Avg Time: 36m 34s | Hits:  12%/1890  
      🟩 MSVC14.16          Pass: 100%/1   | Total Time: 49m 30s | Avg Time: 49m 30s | Hits:  10%/549   
      🟩 MSVC14.29          Pass: 100%/2   | Total Time:  1h 28m | Avg Time: 44m 01s | Hits:  10%/1098  
      🟩 MSVC14.39          Pass: 100%/3   | Total Time:  2h 13m | Avg Time: 44m 38s | Hits:  10%/1647  
    🟨 cxx_name
      🟨 clang              Pass:  90%/43  | Total Time:  1d 01h | Avg Time: 35m 16s | Hits:  16%/26980 
      🟨 gcc                Pass:  91%/47  | Total Time:  1d 02h | Avg Time: 33m 57s | Hits:  15%/29471 
      🟩 Intel              Pass: 100%/3   | Total Time:  1h 49m | Avg Time: 36m 34s | Hits:  12%/1890  
      🟩 MSVC               Pass: 100%/6   | Total Time:  4h 31m | Avg Time: 45m 14s | Hits:  10%/3294  
    🟨 gpu
      🟨 v100               Pass:  91%/99  | Total Time:  2d 10h | Avg Time: 35m 17s | Hits:  15%/61635 
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total Time:  2h 10m | Avg Time: 43m 30s | Hits:  16%/2112  
      🟩 90a                Pass: 100%/4   | Total Time:  1h 15m | Avg Time: 18m 52s | Hits:  16%/2816  
    🟨 std
      🟨 11                 Pass:  92%/26  | Total Time: 14h 21m | Avg Time: 33m 08s | Hits:  15%/16465 
      🟨 14                 Pass:  93%/29  | Total Time: 17h 14m | Avg Time: 35m 40s | Hits:  15%/18112 
      🟨 17                 Pass:  92%/28  | Total Time: 16h 22m | Avg Time: 35m 06s | Hits:  15%/17493 
      🟨 20                 Pass:  87%/16  | Total Time: 10h 14m | Avg Time: 38m 24s | Hits:  16%/9565  
    

🏃‍ Runner counts (total jobs: 198)

# Runner
154 linux-amd64-cpu16
16 linux-arm64-cpu16
16 linux-amd64-gpu-v100-latest-1
12 windows-amd64-cpu16

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental

@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label May 23, 2024
@bernhardmgruber bernhardmgruber marked this pull request as ready for review May 24, 2024 10:49
@bernhardmgruber bernhardmgruber requested review from a team as code owners May 24, 2024 10:49
@bernhardmgruber bernhardmgruber marked this pull request as draft May 24, 2024 10:49
Contributor

🟨 CI Results: Pass: 95%/198 | Total Time: 2d 22h | Avg Time: 21m 19s | Hits: 64%/112452
  • 🟩 thrust: Pass: 100%/99 | Total Time: 20h 06m | Avg Time: 12m 11s | Hits: 78%/50817

    🟩 cpu
      🟩 amd64              Pass: 100%/91  | Total Time: 19h 43m | Avg Time: 13m 00s | Hits:  76%/46709 
      🟩 arm64              Pass: 100%/8   | Total Time: 22m 33s | Avg Time:  2m 49s | Hits:  99%/4108  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total Time:  2h 15m | Avg Time:  9m 03s | Hits:  82%/7700  
      🟩 11.8               Pass: 100%/3   | Total Time:  9m 51s | Avg Time:  3m 17s | Hits:  99%/1542  
      🟩 12.4               Pass: 100%/81  | Total Time: 17h 40m | Avg Time: 13m 05s | Hits:  77%/41575 
    🟩 cudacxx_full
      🟩 clang-cuda16       Pass: 100%/2   | Total Time:  6m 22s | Avg Time:  3m 11s | Hits: 100%/1026  
      🟩 nvcc11.1           Pass: 100%/15  | Total Time:  2h 15m | Avg Time:  9m 03s | Hits:  82%/7700  
      🟩 nvcc11.8           Pass: 100%/3   | Total Time:  9m 51s | Avg Time:  3m 17s | Hits:  99%/1542  
      🟩 nvcc12.4           Pass: 100%/79  | Total Time: 17h 34m | Avg Time: 13m 20s | Hits:  76%/40549 
    🟩 cudacxx_name
      🟩 clang-cuda         Pass: 100%/2   | Total Time:  6m 22s | Avg Time:  3m 11s | Hits: 100%/1026  
      🟩 nvcc               Pass: 100%/97  | Total Time: 20h 00m | Avg Time: 12m 22s | Hits:  78%/49791 
    🟩 cxx_full
      🟩 clang9             Pass: 100%/6   | Total Time:  2h 13m | Avg Time: 22m 16s | Hits:  42%/3078  
      🟩 clang10            Pass: 100%/3   | Total Time:  1h 14m | Avg Time: 24m 46s | Hits:  42%/1539  
      🟩 clang11            Pass: 100%/4   | Total Time:  1h 37m | Avg Time: 24m 17s | Hits:  41%/2052  
      🟩 clang12            Pass: 100%/4   | Total Time:  1h 35m | Avg Time: 23m 56s | Hits:  41%/2052  
      🟩 clang13            Pass: 100%/4   | Total Time:  1h 34m | Avg Time: 23m 35s | Hits:  41%/2052  
      🟩 clang14            Pass: 100%/4   | Total Time: 12m 59s | Avg Time:  3m 14s | Hits: 100%/2052  
      🟩 clang15            Pass: 100%/4   | Total Time: 13m 01s | Avg Time:  3m 15s | Hits: 100%/2052  
      🟩 clang16            Pass: 100%/14  | Total Time:  1h 19m | Avg Time:  5m 41s | Hits: 100%/7182  
      🟩 gcc6               Pass: 100%/2   | Total Time:  5m 09s | Avg Time:  2m 34s | Hits:  99%/1026  
      🟩 gcc7               Pass: 100%/6   | Total Time: 18m 08s | Avg Time:  3m 01s | Hits:  99%/3084  
      🟩 gcc8               Pass: 100%/6   | Total Time: 17m 56s | Avg Time:  2m 59s | Hits:  99%/3084  
      🟩 gcc9               Pass: 100%/6   | Total Time: 17m 32s | Avg Time:  2m 55s | Hits:  99%/3084  
      🟩 gcc10              Pass: 100%/4   | Total Time: 12m 51s | Avg Time:  3m 12s | Hits:  99%/2056  
      🟩 gcc11              Pass: 100%/7   | Total Time: 48m 47s | Avg Time:  6m 58s | Hits:  87%/3598  
      🟩 gcc12              Pass: 100%/16  | Total Time:  1h 33m | Avg Time:  5m 49s | Hits:  99%/8224  
      🟩 Intel2023.2.0      Pass: 100%/3   | Total Time:  1h 36m | Avg Time: 32m 10s | Hits:  16%/1548  
      🟩 MSVC14.16          Pass: 100%/1   | Total Time: 45m 28s | Avg Time: 45m 28s | Hits:  11%/509   
      🟩 MSVC14.29          Pass: 100%/2   | Total Time:  1h 36m | Avg Time: 48m 16s | Hits:  11%/1018  
      🟩 MSVC14.39          Pass: 100%/3   | Total Time:  2h 33m | Avg Time: 51m 05s | Hits:  11%/1527  
    🟩 cxx_name
      🟩 clang              Pass: 100%/43  | Total Time: 10h 00m | Avg Time: 13m 58s | Hits:  71%/22059 
      🟩 gcc                Pass: 100%/47  | Total Time:  3h 33m | Avg Time:  4m 32s | Hits:  97%/24156 
      🟩 Intel              Pass: 100%/3   | Total Time:  1h 36m | Avg Time: 32m 10s | Hits:  16%/1548  
      🟩 MSVC               Pass: 100%/6   | Total Time:  4h 55m | Avg Time: 49m 13s | Hits:  11%/3054  
    🟩 gpu
      🟩 v100               Pass: 100%/99  | Total Time: 20h 06m | Avg Time: 12m 11s | Hits:  78%/50817 
    🟩 jobs
      🟩 build              Pass: 100%/91  | Total Time: 18h 20m | Avg Time: 12m 05s | Hits:  76%/46709 
      🟩 test               Pass: 100%/8   | Total Time:  1h 45m | Avg Time: 13m 14s | Hits:  99%/4108  
    🟩 os
      🟩 ubuntu18.04        Pass: 100%/14  | Total Time:  1h 30m | Avg Time:  6m 27s | Hits:  87%/7191  
      🟩 ubuntu20.04        Pass: 100%/35  | Total Time:  8h 09m | Avg Time: 13m 58s | Hits:  70%/17968 
      🟩 ubuntu22.04        Pass: 100%/44  | Total Time:  5h 31m | Avg Time:  7m 31s | Hits:  92%/22604 
      🟩 windows2022        Pass: 100%/6   | Total Time:  4h 55m | Avg Time: 49m 13s | Hits:  11%/3054  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total Time:  9m 51s | Avg Time:  3m 17s | Hits:  99%/1542  
      🟩 90a                Pass: 100%/4   | Total Time: 11m 17s | Avg Time:  2m 49s | Hits:  99%/2056  
    🟩 std
      🟩 11                 Pass: 100%/26  | Total Time:  3h 50m | Avg Time:  8m 52s | Hits:  84%/13354 
      🟩 14                 Pass: 100%/29  | Total Time:  6h 39m | Avg Time: 13m 47s | Hits:  75%/14881 
      🟩 17                 Pass: 100%/28  | Total Time:  6h 02m | Avg Time: 12m 56s | Hits:  77%/14372 
      🟩 20                 Pass: 100%/16  | Total Time:  3h 33m | Avg Time: 13m 20s | Hits:  78%/8210  
    
  • 🟨 cub: Pass: 91%/99 | Total Time: 2d 02h | Avg Time: 30m 28s | Hits: 53%/61635

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  91%/91  | Total Time:  1d 22h | Avg Time: 30m 40s | Hits:  52%/56003 
      🟩 arm64              Pass: 100%/8   | Total Time:  3h 45m | Avg Time: 28m 10s | Hits:  57%/5632  
    🔍 ctk: 12.4 🔍
      🟩 11.1               Pass: 100%/15  | Total Time:  6h 23m | Avg Time: 25m 32s | Hits:  50%/9350  
      🟩 11.8               Pass: 100%/3   | Total Time:  1h 53m | Avg Time: 37m 53s | Hits:  57%/2112  
      🔍 12.4               Pass:  90%/81  | Total Time:  1d 17h | Avg Time: 31m 06s | Hits:  53%/50173 
    🔍 cudacxx_full: nvcc12.4 🔍
      🟩 clang-cuda16       Pass: 100%/2   | Total Time: 38m 21s | Avg Time: 19m 10s | Hits:  59%/1116  
      🟩 nvcc11.1           Pass: 100%/15  | Total Time:  6h 23m | Avg Time: 25m 32s | Hits:  50%/9350  
      🟩 nvcc11.8           Pass: 100%/3   | Total Time:  1h 53m | Avg Time: 37m 53s | Hits:  57%/2112  
      🔍 nvcc12.4           Pass:  89%/79  | Total Time:  1d 17h | Avg Time: 31m 24s | Hits:  53%/49057 
    🔍 cudacxx_name: nvcc 🔍
      🟩 clang-cuda         Pass: 100%/2   | Total Time: 38m 21s | Avg Time: 19m 10s | Hits:  59%/1116  
      🔍 nvcc               Pass:  91%/97  | Total Time:  2d 01h | Avg Time: 30m 42s | Hits:  53%/60519 
    🚨 jobs: test 🚨
      🟩 build              Pass: 100%/91  | Total Time:  1d 19h | Avg Time: 28m 35s | Hits:  53%/61635 
      🔥 test               Pass:   0%/8   | Total Time:  6h 55m | Avg Time: 51m 53s
    🔍 os: ubuntu22.04 🔍
      🟩 ubuntu18.04        Pass: 100%/14  | Total Time:  5h 35m | Avg Time: 23m 59s | Hits:  53%/8801  
      🟩 ubuntu20.04        Pass: 100%/35  | Total Time: 16h 27m | Avg Time: 28m 12s | Hits:  57%/24710 
      🔍 ubuntu22.04        Pass:  81%/44  | Total Time: 23h 43m | Avg Time: 32m 21s | Hits:  54%/24830 
      🟩 windows2022        Pass: 100%/6   | Total Time:  4h 29m | Avg Time: 44m 57s | Hits:  10%/3294  
    🟨 cxx_full
      🟩 clang9             Pass: 100%/6   | Total Time:  2h 32m | Avg Time: 25m 27s | Hits:  55%/4002  
      🟩 clang10            Pass: 100%/3   | Total Time:  1h 23m | Avg Time: 27m 43s | Hits:  58%/2118  
      🟩 clang11            Pass: 100%/4   | Total Time:  1h 51m | Avg Time: 27m 54s | Hits:  58%/2824  
      🟩 clang12            Pass: 100%/4   | Total Time:  1h 48m | Avg Time: 27m 10s | Hits:  58%/2824  
      🟩 clang13            Pass: 100%/4   | Total Time:  1h 49m | Avg Time: 27m 23s | Hits:  58%/2824  
      🟩 clang14            Pass: 100%/4   | Total Time:  1h 49m | Avg Time: 27m 24s | Hits:  58%/2824  
      🟩 clang15            Pass: 100%/4   | Total Time:  1h 52m | Avg Time: 28m 03s | Hits:  57%/2816  
      🟨 clang16            Pass:  71%/14  | Total Time:  7h 52m | Avg Time: 33m 47s | Hits:  58%/6748  
      🟩 gcc6               Pass: 100%/2   | Total Time: 47m 58s | Avg Time: 23m 59s | Hits:  53%/1256  
      🟩 gcc7               Pass: 100%/6   | Total Time:  2h 39m | Avg Time: 26m 31s | Hits:  55%/4005  
      🟩 gcc8               Pass: 100%/6   | Total Time:  2h 40m | Avg Time: 26m 40s | Hits:  55%/4005  
      🟩 gcc9               Pass: 100%/6   | Total Time:  2h 44m | Avg Time: 27m 26s | Hits:  55%/4005  
      🟩 gcc10              Pass: 100%/4   | Total Time:  1h 55m | Avg Time: 28m 54s | Hits:  57%/2824  
      🟩 gcc11              Pass: 100%/7   | Total Time:  3h 47m | Avg Time: 32m 32s | Hits:  57%/4928  
      🟨 gcc12              Pass:  75%/16  | Total Time:  8h 22m | Avg Time: 31m 25s | Hits:  57%/8448  
      🟩 Intel2023.2.0      Pass: 100%/3   | Total Time:  1h 48m | Avg Time: 36m 00s | Hits:  12%/1890  
      🟩 MSVC14.16          Pass: 100%/1   | Total Time: 47m 12s | Avg Time: 47m 12s | Hits:  10%/549   
      🟩 MSVC14.29          Pass: 100%/2   | Total Time:  1h 26m | Avg Time: 43m 27s | Hits:  10%/1098  
      🟩 MSVC14.39          Pass: 100%/3   | Total Time:  2h 15m | Avg Time: 45m 11s | Hits:  10%/1647  
    🟨 cxx_name
      🟨 clang              Pass:  90%/43  | Total Time: 21h 00m | Avg Time: 29m 19s | Hits:  57%/26980 
      🟨 gcc                Pass:  91%/47  | Total Time: 22h 58m | Avg Time: 29m 19s | Hits:  56%/29471 
      🟩 Intel              Pass: 100%/3   | Total Time:  1h 48m | Avg Time: 36m 00s | Hits:  12%/1890  
      🟩 MSVC               Pass: 100%/6   | Total Time:  4h 29m | Avg Time: 44m 57s | Hits:  10%/3294  
    🟨 gpu
      🟨 v100               Pass:  91%/99  | Total Time:  2d 02h | Avg Time: 30m 28s | Hits:  53%/61635 
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total Time:  1h 53m | Avg Time: 37m 53s | Hits:  57%/2112  
      🟩 90a                Pass: 100%/4   | Total Time:  1h 03m | Avg Time: 15m 49s | Hits:  57%/2816  
    🟨 std
      🟨 11                 Pass:  92%/26  | Total Time: 12h 39m | Avg Time: 29m 12s | Hits:  55%/16465 
      🟨 14                 Pass:  93%/29  | Total Time: 15h 15m | Avg Time: 31m 34s | Hits:  51%/18112 
      🟨 17                 Pass:  92%/28  | Total Time: 14h 02m | Avg Time: 30m 06s | Hits:  52%/17493 
      🟨 20                 Pass:  87%/16  | Total Time:  8h 18m | Avg Time: 31m 09s | Hits:  55%/9565  
    

🏃‍ Runner counts (total jobs: 198)

# Runner
154 linux-amd64-cpu16
16 linux-arm64-cpu16
16 linux-amd64-gpu-v100-latest-1
12 windows-amd64-cpu16

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental

@bernhardmgruber bernhardmgruber marked this pull request as ready for review May 28, 2024 08:34
@bernhardmgruber bernhardmgruber enabled auto-merge (squash) May 28, 2024 10:01
Contributor

🟩 CI Results: Pass: 100%/198 | Total Time: 4d 09h | Avg Time: 32m 05s | Hits: 22%/132439
  • 🟩 thrust: Pass: 100%/99 | Total Time: 1d 20h | Avg Time: 27m 03s | Hits: 22%/50817

    🟩 cpu
      🟩 amd64              Pass: 100%/91  | Total Time:  1d 16h | Avg Time: 26m 54s | Hits:  22%/46709 
      🟩 arm64              Pass: 100%/8   | Total Time:  3h 49m | Avg Time: 28m 44s | Hits:  15%/4108  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total Time:  6h 10m | Avg Time: 24m 40s | Hits:  15%/7700  
      🟩 11.8               Pass: 100%/3   | Total Time:  1h 40m | Avg Time: 33m 36s | Hits:  15%/1542  
      🟩 12.4               Pass: 100%/81  | Total Time:  1d 12h | Avg Time: 27m 15s | Hits:  23%/41575 
    🟩 cudacxx_full
      🟩 clang-cuda16       Pass: 100%/2   | Total Time: 48m 38s | Avg Time: 24m 19s | Hits:  13%/1026  
      🟩 nvcc11.1           Pass: 100%/15  | Total Time:  6h 10m | Avg Time: 24m 40s | Hits:  15%/7700  
      🟩 nvcc11.8           Pass: 100%/3   | Total Time:  1h 40m | Avg Time: 33m 36s | Hits:  15%/1542  
      🟩 nvcc12.4           Pass: 100%/79  | Total Time:  1d 11h | Avg Time: 27m 19s | Hits:  23%/40549 
    🟩 cudacxx_name
      🟩 clang-cuda         Pass: 100%/2   | Total Time: 48m 38s | Avg Time: 24m 19s | Hits:  13%/1026  
      🟩 nvcc               Pass: 100%/97  | Total Time:  1d 19h | Avg Time: 27m 06s | Hits:  22%/49791 
    🟩 cxx_full
      🟩 clang9             Pass: 100%/6   | Total Time:  2h 31m | Avg Time: 25m 17s | Hits:  15%/3078  
      🟩 clang10            Pass: 100%/3   | Total Time:  1h 22m | Avg Time: 27m 39s | Hits:  15%/1539  
      🟩 clang11            Pass: 100%/4   | Total Time:  1h 42m | Avg Time: 25m 33s | Hits:  15%/2052  
      🟩 clang12            Pass: 100%/4   | Total Time:  1h 48m | Avg Time: 27m 11s | Hits:  15%/2052  
      🟩 clang13            Pass: 100%/4   | Total Time:  1h 51m | Avg Time: 27m 50s | Hits:  15%/2052  
      🟩 clang14            Pass: 100%/4   | Total Time:  1h 44m | Avg Time: 26m 05s | Hits:  15%/2052  
      🟩 clang15            Pass: 100%/4   | Total Time:  1h 49m | Avg Time: 27m 16s | Hits:  15%/2052  
      🟩 clang16            Pass: 100%/14  | Total Time:  5h 11m | Avg Time: 22m 15s | Hits:  39%/7182  
      🟩 gcc6               Pass: 100%/2   | Total Time: 48m 23s | Avg Time: 24m 11s | Hits:  16%/1026  
      🟩 gcc7               Pass: 100%/6   | Total Time:  2h 27m | Avg Time: 24m 32s | Hits:  15%/3084  
      🟩 gcc8               Pass: 100%/6   | Total Time:  2h 39m | Avg Time: 26m 31s | Hits:  15%/3084  
      🟩 gcc9               Pass: 100%/6   | Total Time:  2h 37m | Avg Time: 26m 18s | Hits:  15%/3084  
      🟩 gcc10              Pass: 100%/4   | Total Time:  1h 55m | Avg Time: 28m 51s | Hits:  15%/2056  
      🟩 gcc11              Pass: 100%/7   | Total Time:  3h 29m | Avg Time: 29m 57s | Hits:  15%/3598  
      🟩 gcc12              Pass: 100%/16  | Total Time:  5h 43m | Avg Time: 21m 29s | Hits:  36%/8224  
      🟩 Intel2023.2.0      Pass: 100%/3   | Total Time:  2h 02m | Avg Time: 40m 54s | Hits:  16%/1548  
      🟩 MSVC14.16          Pass: 100%/1   | Total Time: 45m 27s | Avg Time: 45m 27s | Hits:  11%/509   
      🟩 MSVC14.29          Pass: 100%/2   | Total Time:  1h 37m | Avg Time: 48m 45s | Hits:  11%/1018  
      🟩 MSVC14.39          Pass: 100%/3   | Total Time:  2h 28m | Avg Time: 49m 37s | Hits:  11%/1527  
    🟩 cxx_name
      🟩 clang              Pass: 100%/43  | Total Time: 18h 02m | Avg Time: 25m 10s | Hits:  23%/22059 
      🟩 gcc                Pass: 100%/47  | Total Time: 19h 41m | Avg Time: 25m 08s | Hits:  22%/24156 
      🟩 Intel              Pass: 100%/3   | Total Time:  2h 02m | Avg Time: 40m 54s | Hits:  16%/1548  
      🟩 MSVC               Pass: 100%/6   | Total Time:  4h 51m | Avg Time: 48m 38s | Hits:  11%/3054  
    🟩 gpu
      🟩 v100               Pass: 100%/99  | Total Time:  1d 20h | Avg Time: 27m 03s | Hits:  22%/50817 
    🟩 jobs
      🟩 build              Pass: 100%/91  | Total Time:  1d 18h | Avg Time: 28m 17s | Hits:  15%/46709 
      🟩 test               Pass: 100%/8   | Total Time:  1h 42m | Avg Time: 12m 52s | Hits:  99%/4108  
    🟩 os
      🟩 ubuntu18.04        Pass: 100%/14  | Total Time:  5h 24m | Avg Time: 23m 11s | Hits:  15%/7191  
      🟩 ubuntu20.04        Pass: 100%/35  | Total Time: 16h 04m | Avg Time: 27m 33s | Hits:  15%/17968 
      🟩 ubuntu22.04        Pass: 100%/44  | Total Time: 18h 16m | Avg Time: 24m 55s | Hits:  30%/22604 
      🟩 windows2022        Pass: 100%/6   | Total Time:  4h 51m | Avg Time: 48m 38s | Hits:  11%/3054  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total Time:  1h 40m | Avg Time: 33m 36s | Hits:  15%/1542  
      🟩 90a                Pass: 100%/4   | Total Time:  1h 01m | Avg Time: 15m 19s | Hits:  15%/2056  
    🟩 std
      🟩 11                 Pass: 100%/26  | Total Time:  9h 39m | Avg Time: 22m 18s | Hits:  26%/13354 
      🟩 14                 Pass: 100%/29  | Total Time: 14h 22m | Avg Time: 29m 43s | Hits:  19%/14881 
      🟩 17                 Pass: 100%/28  | Total Time: 13h 17m | Avg Time: 28m 28s | Hits:  19%/14372 
      🟩 20                 Pass: 100%/16  | Total Time:  7h 18m | Avg Time: 27m 25s | Hits:  24%/8210  
    
  • 🟩 cub: Pass: 100%/99 | Total Time: 2d 13h | Avg Time: 37m 07s | Hits: 23%/81622

    🟩 cpu
      🟩 amd64              Pass: 100%/91  | Total Time:  2d 08h | Avg Time: 37m 06s | Hits:  23%/74830 
      🟩 arm64              Pass: 100%/8   | Total Time:  4h 57m | Avg Time: 37m 11s | Hits:  17%/6792  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total Time:  7h 30m | Avg Time: 30m 02s | Hits:  12%/11525 
      🟩 11.8               Pass: 100%/3   | Total Time:  2h 07m | Avg Time: 42m 34s | Hits:  16%/2547  
      🟩 12.4               Pass: 100%/81  | Total Time:  2d 03h | Avg Time: 38m 13s | Hits:  25%/67550 
    🟩 cudacxx_full
      🟩 clang-cuda16       Pass: 100%/2   | Total Time: 53m 20s | Avg Time: 26m 40s | Hits:  13%/1406  
      🟩 nvcc11.1           Pass: 100%/15  | Total Time:  7h 30m | Avg Time: 30m 02s | Hits:  12%/11525 
      🟩 nvcc11.8           Pass: 100%/3   | Total Time:  2h 07m | Avg Time: 42m 34s | Hits:  16%/2547  
      🟩 nvcc12.4           Pass: 100%/79  | Total Time:  2d 02h | Avg Time: 38m 31s | Hits:  25%/66144 
    🟩 cudacxx_name
      🟩 clang-cuda         Pass: 100%/2   | Total Time: 53m 20s | Avg Time: 26m 40s | Hits:  13%/1406  
      🟩 nvcc               Pass: 100%/97  | Total Time:  2d 12h | Avg Time: 37m 20s | Hits:  23%/80216 
    🟩 cxx_full
      🟩 clang9             Pass: 100%/6   | Total Time:  3h 06m | Avg Time: 31m 07s | Hits:  15%/4872  
      🟩 clang10            Pass: 100%/3   | Total Time:  1h 39m | Avg Time: 33m 04s | Hits:  17%/2553  
      🟩 clang11            Pass: 100%/4   | Total Time:  2h 09m | Avg Time: 32m 19s | Hits:  17%/3404  
      🟩 clang12            Pass: 100%/4   | Total Time:  2h 18m | Avg Time: 34m 44s | Hits:  17%/3404  
      🟩 clang13            Pass: 100%/4   | Total Time:  2h 38m | Avg Time: 39m 41s | Hits:  17%/3404  
      🟩 clang14            Pass: 100%/4   | Total Time:  2h 15m | Avg Time: 33m 54s | Hits:  17%/3404  
      🟩 clang15            Pass: 100%/4   | Total Time:  2h 43m | Avg Time: 40m 55s | Hits:  17%/3396  
      🟩 clang16            Pass: 100%/14  | Total Time:  9h 50m | Avg Time: 42m 10s | Hits:  41%/11594 
      🟩 gcc6               Pass: 100%/2   | Total Time: 55m 56s | Avg Time: 27m 58s | Hits:  12%/1546  
      🟩 gcc7               Pass: 100%/6   | Total Time:  3h 00m | Avg Time: 30m 09s | Hits:  15%/4875  
      🟩 gcc8               Pass: 100%/6   | Total Time:  3h 16m | Avg Time: 32m 47s | Hits:  15%/4875  
      🟩 gcc9               Pass: 100%/6   | Total Time:  3h 04m | Avg Time: 30m 42s | Hits:  15%/4875  
      🟩 gcc10              Pass: 100%/4   | Total Time:  2h 27m | Avg Time: 36m 53s | Hits:  17%/3404  
      🟩 gcc11              Pass: 100%/7   | Total Time:  4h 24m | Avg Time: 37m 49s | Hits:  16%/5943  
      🟩 gcc12              Pass: 100%/16  | Total Time: 10h 47m | Avg Time: 40m 26s | Hits:  37%/13584 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total Time:  1h 54m | Avg Time: 38m 11s | Hits:  13%/2325  
      🟩 MSVC14.16          Pass: 100%/1   | Total Time: 53m 01s | Avg Time: 53m 01s | Hits:  11%/694   
      🟩 MSVC14.29          Pass: 100%/2   | Total Time:  1h 31m | Avg Time: 45m 34s | Hits:  11%/1388  
      🟩 MSVC14.39          Pass: 100%/3   | Total Time:  2h 16m | Avg Time: 45m 20s | Hits:  11%/2082  
    🟩 cxx_name
      🟩 clang              Pass: 100%/43  | Total Time:  1d 02h | Avg Time: 37m 16s | Hits:  24%/36031 
      🟩 gcc                Pass: 100%/47  | Total Time:  1d 03h | Avg Time: 35m 41s | Hits:  23%/39102 
      🟩 Intel              Pass: 100%/3   | Total Time:  1h 54m | Avg Time: 38m 11s | Hits:  13%/2325  
      🟩 MSVC               Pass: 100%/6   | Total Time:  4h 40m | Avg Time: 46m 41s | Hits:  11%/4164  
    🟩 gpu
      🟩 v100               Pass: 100%/99  | Total Time:  2d 13h | Avg Time: 37m 07s | Hits:  23%/81622 
    🟩 jobs
      🟩 build              Pass: 100%/91  | Total Time:  2d 04h | Avg Time: 34m 30s | Hits:  16%/74830 
      🟩 test               Pass: 100%/8   | Total Time:  8h 55m | Avg Time:  1h 06m | Hits:  99%/6792  
    🟩 os
      🟩 ubuntu18.04        Pass: 100%/14  | Total Time:  6h 37m | Avg Time: 28m 24s | Hits:  12%/10831 
      🟩 ubuntu20.04        Pass: 100%/35  | Total Time: 20h 16m | Avg Time: 34m 45s | Hits:  17%/29785 
      🟩 ubuntu22.04        Pass: 100%/44  | Total Time:  1d 05h | Avg Time: 40m 28s | Hits:  31%/36842 
      🟩 windows2022        Pass: 100%/6   | Total Time:  4h 40m | Avg Time: 46m 41s | Hits:  11%/4164  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total Time:  2h 07m | Avg Time: 42m 34s | Hits:  16%/2547  
      🟩 90a                Pass: 100%/4   | Total Time:  1h 14m | Avg Time: 18m 35s | Hits:  16%/3396  
    🟩 std
      🟩 11                 Pass: 100%/26  | Total Time: 15h 23m | Avg Time: 35m 31s | Hits:  22%/21643 
      🟩 14                 Pass: 100%/29  | Total Time: 17h 54m | Avg Time: 37m 02s | Hits:  21%/23725 
      🟩 17                 Pass: 100%/28  | Total Time: 17h 32m | Avg Time: 37m 34s | Hits:  22%/22961 
      🟩 20                 Pass: 100%/16  | Total Time: 10h 24m | Avg Time: 39m 01s | Hits:  27%/13293 
    

@bernhardmgruber bernhardmgruber merged commit 79f7ae4 into NVIDIA:main May 28, 2024
6 checks passed
@bernhardmgruber bernhardmgruber deleted the arraywrapper branch May 28, 2024 13:06
@pauleonix
Contributor

I'm a bit late to the party, but I think there should be a better way than to use .__elems_ (or other hacky workarounds?) to actually use cuda::std::array with CUB algorithms as a user. Should I open a new issue for that?

@bernhardmgruber
Contributor Author

there should be a better way than to use .__elems_ (or other hacky workarounds?) to actually use cuda::std::array with CUB algorithms as a user. Should I open a new issue for that?

I fully agree. But I don't see where you would need this combination as a user of CUB. Could you please provide an example where you are forced to do that? Feel free to file it as an issue if you like!

@pauleonix
Contributor

pauleonix commented Jun 18, 2024

@bernhardmgruber Basically all the CUB block algorithms take C-style arrays in their API. At the moment I use my own array wrapper because I did not know about CUB's built-in one and I figured that I can't use cuda::std::array because the standard API does not allow direct access to the array member.

As one can't return C-style arrays from functions, I think the cleanest solution would be to let CUB's API take cuda::std::array& directly, but that would involve a lot of additional boilerplate code (on the library side) that then just calls the old API with .__elems_.
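The forwarding idea described above can be sketched in plain host C++. This is only an illustration under stated assumptions: `sum_items` is a hypothetical stand-in for a block algorithm that takes a reference to a native array, and `array_wrapper` is a hypothetical user-side wrapper with a public array member; neither name is part of CUB or libcu++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a CUB-style block algorithm: it takes a
// reference to a native array, like the block-level APIs discussed above.
template <typename T, std::size_t N>
T sum_items(T (&items)[N])
{
  T total{};
  for (std::size_t i = 0; i < N; ++i)
  {
    total += items[i];
  }
  return total;
}

// Hypothetical user-side wrapper that exposes its native array publicly.
template <typename T, std::size_t N>
struct array_wrapper
{
  T array[N];
};

// Thin forwarding overload: accepts the wrapper and calls the
// native-array overload with the wrapped member.
template <typename T, std::size_t N>
T sum_items(array_wrapper<T, N>& items)
{
  return sum_items(items.array);
}

// A wrapper can be returned by value, which a native array cannot.
inline array_wrapper<int, 4> make_items()
{
  return {{1, 2, 3, 4}};
}

// Usage: both the native array and the wrapper reach the same code path.
inline void usage_example()
{
  int raw[3] = {5, 6, 7};
  assert(sum_items(raw) == 18);

  array_wrapper<int, 4> wrapped = make_items();
  assert(sum_items(wrapped) == 10);
}
```

The extra overload is the kind of library-side boilerplate mentioned above: one forwarding function per entry point that accepts an array.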

@bernhardmgruber
Contributor Author

At the moment I use my own array wrapper because I did not know about CUB's built-in one

So I conclude that this PR is not a breaking change for you, then?

I think the cleanest solution would be to let CUB's API take cuda::std::array& directly

Ah, I understand what you are getting at. You have a feature request for CUB APIs to take cuda::std::array instead of native arrays? That definitely needs a separate issue so we can discuss it, since it touches a large part of the library. Please open an issue, thank you!

@pauleonix
Contributor

pauleonix commented Jun 18, 2024

@bernhardmgruber

So I conclude that this PR is not a breaking change for you, then?

It isn't for me personally, but I posted here because it is breaking for people that do use cub::ArrayWrapper with CUB's block algorithms (maybe they don't exist 🤷‍♂️), because cuda::std::array provides a reduced feature set when one doesn't want to rely on implementation details. Using implementation details from libcu++ in CUB might be fine, as both are part of CCCL, but for users I would have expected a better solution when advertising cuda::std::array as replacement for a public cub::ArrayWrapper that provides direct access to array. I will post an issue.

@bernhardmgruber
Contributor Author

it is breaking for people that do use cub::ArrayWrapper with CUB's block algorithms

AFAIK, CUB's block algorithms take references to native arrays, so users can use any type to supply those. I agree that cuda::std::array is not a good choice, because you would rely on an implementation detail to get the native array.

for users I would have expected a better solution when advertising cuda::std::array as replacement for a public cub::ArrayWrapper that provides direct access to array.

I see cub::ArrayWrapper as a means to pass native arrays by value to a kernel. cuda::std::array 100% fulfills this role.

IIUC, you are asking about the combination of both use cases: using one type to pass arrays by value to a kernel, and then using the same instance again inside the kernel to pass a native array to a CUB block algorithm? This use case is indeed affected.
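The pass-by-value role mentioned above can be illustrated with a host-only sketch. It assumes `std::array` as a stand-in for `cuda::std::array`, and an ordinary function, `bump_and_sum`, as a hypothetical stand-in for a kernel; kernel arguments are likewise copied by value.

```cpp
#include <array>
#include <cassert>
#include <type_traits>

// Host-side stand-in: std::array plays the role of cuda::std::array here.
using bins_t = std::array<int, 3>;

// The array type is copied as a whole, element storage included.
static_assert(std::is_trivially_copyable<bins_t>::value,
              "bins_t is expected to be trivially copyable");

// Hypothetical stand-in for a kernel taking the array by value.
// A native array parameter would decay to a pointer instead of being copied.
inline int bump_and_sum(bins_t bins)
{
  int total = 0;
  for (int& b : bins)
  {
    b += 1;     // modifies only the local copy
    total += b;
  }
  return total;
}

// Usage: the caller's array is unchanged after the call.
inline void usage_example()
{
  bins_t bins{{1, 2, 3}};
  assert(bump_and_sum(bins) == 9);
  assert(bins[0] == 1 && bins[1] == 2 && bins[2] == 3);
}
```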

@pauleonix
Contributor

pauleonix commented Jun 18, 2024

@bernhardmgruber Ok, I was not aware that passing arrays to kernels was the main use case of ArrayWrapper. Given that it is only used inside CUB to interface with device functions I considered interfacing with CUB's block algorithms to be its main use case or at least one of its main use cases.

Labels
cub For all items related to CUB