Avoid ::result_type for partial sums in TBB reduce_by_key #1998

Merged: 5 commits, Jul 26, 2024

bernhardmgruber
Contributor

This allows us to get rid of partial_sum_type in the TBB backend for reduce_by_key with no initial value. partial_sum_type still relies on ::result_type, which belongs to the C++11-deprecated function object API.

This is potentially a breaking change, because users could, e.g., reduce a large sequence of int8 using a binary function advertising a ::result_type of int64, which would now accumulate on an int8 variable, instead of int64. However, this seems to be the current behavior of the CUDA backend anyway, which always takes the iterator's value_type instead of querying the return type of the binary reduction function.
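
The behavior change described above can be illustrated with a minimal standalone sketch. It does not use Thrust or TBB and is not the backend's actual implementation: `wide_plus` and `reduce_run` are hypothetical names introduced only for this illustration, and `std::uint8_t` stands in for int8 so that the narrowing conversion is well-defined modular arithmetic.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical binary functor that advertises a wide ::result_type,
// in the style of the old function object protocol.
struct wide_plus
{
  using result_type = std::int64_t;
  std::int64_t operator()(std::int64_t a, std::int64_t b) const
  {
    return a + b;
  }
};

// Hypothetical helper: reduces one non-empty run of values with no
// initial value, storing every partial sum in a variable of type Acc.
template <typename Acc, typename T, typename BinaryOp>
Acc reduce_run(const std::vector<T>& values, BinaryOp op)
{
  Acc acc = static_cast<Acc>(values.front());
  for (std::size_t i = 1; i < values.size(); ++i)
  {
    // The functor computes in int64, but the result is narrowed to Acc
    // on every step, so a narrow Acc loses the high bits.
    acc = static_cast<Acc>(op(acc, values[i]));
  }
  return acc;
}
```

For an input of 300 ones stored as `std::uint8_t`, `reduce_run<wide_plus::result_type>(values, wide_plus{})` returns 300, which corresponds to the old accumulator choice. `reduce_run<std::uint8_t>(values, wide_plus{})` returns 44 (300 mod 256), which corresponds to accumulating in the iterator's value type.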

bernhardmgruber added the thrust label (For all items related to Thrust.) on Jul 17, 2024
bernhardmgruber marked this pull request as ready for review on July 17, 2024 12:07
bernhardmgruber requested review from a team as code owners on July 17, 2024 12:07
Contributor

🟨 CI finished in 3h 33m: Pass: 99%/250 | Total: 1d 11h | Avg: 8m 31s | Max: 42m 13s | Hits: 90%/247487
  • 🟨 cub: Pass: 99%/131 | Total: 19h 50m | Avg: 9m 05s | Max: 42m 13s | Hits: 99%/108575

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/123 | Total: 19h 12m | Avg:  9m 22s | Max: 42m 13s | Hits:  99%/101743
      🟩 arm64              Pass: 100%/8   | Total: 38m 17s | Avg:  4m 47s | Max:  5m 09s | Hits:  99%/6832  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 05m | Avg:  4m 20s | Max: 12m 08s | Hits:  99%/11598 
      🟩 11.8               Pass: 100%/3   | Total: 14m 05s | Avg:  4m 41s | Max:  5m 18s | Hits:  99%/2562  
      🔍 12.5               Pass:  99%/113 | Total: 18h 31m | Avg:  9m 50s | Max: 42m 13s | Hits:  99%/94415 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 25s | Avg:  3m 42s | Max:  3m 43s | Hits:  99%/1412  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 05m | Avg:  4m 20s | Max: 12m 08s | Hits:  99%/11598 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 14m 05s | Avg:  4m 41s | Max:  5m 18s | Hits:  99%/2562  
      🔍 nvcc12.5           Pass:  99%/111 | Total: 18h 24m | Avg:  9m 56s | Max: 42m 13s | Hits:  99%/93003 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 25s | Avg:  3m 42s | Max:  3m 43s | Hits:  99%/1412  
      🔍 nvcc               Pass:  99%/129 | Total: 19h 43m | Avg:  9m 10s | Max: 42m 13s | Hits:  99%/107163
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total: 26m 43s | Avg:  4m 27s | Max:  5m 12s | Hits: 100%/4902  
      🟩 Clang10            Pass: 100%/3   | Total: 15m 30s | Avg:  5m 10s | Max:  5m 13s | Hits:  99%/2568  
      🟩 Clang11            Pass: 100%/4   | Total: 17m 50s | Avg:  4m 27s | Max:  4m 35s | Hits: 100%/3424  
      🟩 Clang12            Pass: 100%/4   | Total: 17m 35s | Avg:  4m 23s | Max:  4m 41s | Hits: 100%/3424  
      🟩 Clang13            Pass: 100%/4   | Total: 17m 44s | Avg:  4m 26s | Max:  4m 32s | Hits:  99%/3424  
      🟩 Clang14            Pass: 100%/4   | Total: 17m 39s | Avg:  4m 24s | Max:  4m 37s | Hits:  99%/3424  
      🟩 Clang15            Pass: 100%/4   | Total: 18m 33s | Avg:  4m 38s | Max:  4m 50s | Hits:  99%/3416  
      🟩 Clang16            Pass: 100%/4   | Total: 17m 42s | Avg:  4m 25s | Max:  4m 32s | Hits: 100%/3416  
      🟩 Clang17            Pass: 100%/26  | Total:  6h 58m | Avg: 16m 06s | Max: 34m 05s | Hits:  99%/21908 
      🟩 GCC6               Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  3m 42s | Hits:  99%/1556  
      🟩 GCC7               Pass: 100%/6   | Total: 24m 20s | Avg:  4m 03s | Max:  4m 34s | Hits:  99%/4905  
      🟩 GCC8               Pass: 100%/6   | Total: 24m 18s | Avg:  4m 03s | Max:  4m 26s | Hits:  99%/4905  
      🟩 GCC9               Pass: 100%/6   | Total: 24m 20s | Avg:  4m 03s | Max:  4m 33s | Hits:  99%/4905  
      🟩 GCC10              Pass: 100%/4   | Total: 17m 14s | Avg:  4m 18s | Max:  4m 34s | Hits:  99%/3424  
      🟩 GCC11              Pass: 100%/7   | Total: 31m 45s | Avg:  4m 32s | Max:  5m 18s | Hits:  99%/5978  
      🟩 GCC12              Pass: 100%/4   | Total: 17m 50s | Avg:  4m 27s | Max:  4m 35s | Hits:  99%/3416  
      🔍 GCC13              Pass:  96%/28  | Total:  6h 35m | Avg: 14m 07s | Max: 42m 13s | Hits:  98%/23058 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 24s | Avg:  5m 08s | Max:  5m 12s | Hits: 100%/2340  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 08s | Avg: 12m 08s | Max: 12m 08s | Hits:  98%/697   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 19m 52s | Avg:  9m 56s | Max: 10m 09s | Hits:  98%/1394  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 32m 32s | Avg: 10m 50s | Max: 11m 14s | Hits:  98%/2091  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/59  | Total:  9h 28m | Avg:  9m 37s | Max: 34m 05s | Hits:  99%/49906 
      🔍 GCC                Pass:  98%/63  | Total:  9h 02m | Avg:  8m 36s | Max: 42m 13s | Hits:  98%/52147 
      🟩 Intel              Pass: 100%/3   | Total: 15m 24s | Avg:  5m 08s | Max:  5m 12s | Hits: 100%/2340  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 04m | Avg: 10m 45s | Max: 12m 08s | Hits:  98%/4182  
    🔍 jobs: GraphCapture 🔍
      🟩 Build              Pass: 100%/99  | Total:  7h 53m | Avg:  4m 47s | Max: 12m 08s | Hits:  99%/82101 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 32m | Avg: 19m 01s | Max: 31m 04s | Hits:  99%/6832  
      🔍 GraphCapture       Pass:  87%/8   | Total:  1h 57m | Avg: 14m 39s | Max: 27m 13s | Hits:  99%/5978  
      🟩 HostLaunch         Pass: 100%/8   | Total:  3h 27m | Avg: 25m 59s | Max: 42m 13s | Hits:  94%/6832  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 59m | Avg: 29m 56s | Max: 38m 10s | Hits:  99%/6832  
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/34  | Total:  4h 52m | Avg:  8m 36s | Max: 35m 39s | Hits:  99%/28605 
      🟩 14                 Pass: 100%/37  | Total:  5h 36m | Avg:  9m 05s | Max: 38m 10s | Hits:  99%/30696 
      🟩 17                 Pass: 100%/36  | Total:  5h 29m | Avg:  9m 08s | Max: 42m 13s | Hits:  98%/29927 
      🔍 20                 Pass:  95%/24  | Total:  3h 52m | Avg:  9m 41s | Max: 31m 04s | Hits:  99%/19347 
    🟨 gpu
      🟨 v100               Pass:  99%/131 | Total: 19h 50m | Avg:  9m 05s | Max: 42m 13s | Hits:  99%/108575
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 14m 05s | Avg:  4m 41s | Max:  5m 18s | Hits:  99%/2562  
      🟩 90a                Pass: 100%/4   | Total: 14m 36s | Avg:  3m 39s | Max:  3m 51s | Hits:  99%/3416  
    
  • 🟩 thrust: Pass: 100%/118 | Total: 15h 26m | Avg: 7m 51s | Max: 31m 46s | Hits: 84%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 14h 37m | Avg:  7m 58s | Max: 31m 46s | Hits:  84%/129492
      🟩 arm64              Pass: 100%/8   | Total: 49m 16s | Avg:  6m 09s | Max:  6m 27s | Hits:  80%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 32m | Avg:  6m 08s | Max: 17m 49s | Hits:  80%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 17m 54s | Avg:  5m 58s | Max:  6m 11s | Hits:  80%/3534  
      🟩 12.5               Pass: 100%/100 | Total: 13h 36m | Avg:  8m 09s | Max: 31m 46s | Hits:  84%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 12m 03s | Avg:  6m 01s | Max:  6m 04s | Hits:  80%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 32m | Avg:  6m 08s | Max: 17m 49s | Hits:  80%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 17m 54s | Avg:  5m 58s | Max:  6m 11s | Hits:  80%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total: 13h 24m | Avg:  8m 12s | Max: 31m 46s | Hits:  84%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 12m 03s | Avg:  6m 01s | Max:  6m 04s | Hits:  80%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 15h 14m | Avg:  7m 53s | Max: 31m 46s | Hits:  84%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 35m 16s | Avg:  5m 52s | Max:  6m 35s | Hits:  80%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 19m 23s | Avg:  6m 27s | Max:  6m 41s | Hits:  80%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 23m 26s | Avg:  5m 51s | Max:  6m 27s | Hits:  80%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 23m 29s | Avg:  5m 52s | Max:  6m 23s | Hits:  80%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 23m 20s | Avg:  5m 50s | Max:  5m 58s | Hits:  80%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 23m 34s | Avg:  5m 53s | Max:  6m 12s | Hits:  80%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 24m 11s | Avg:  6m 02s | Max:  6m 22s | Hits:  80%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 23m 56s | Avg:  5m 59s | Max:  6m 12s | Hits:  80%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  2h 28m | Avg:  8m 16s | Max: 17m 31s | Hits:  89%/21186 
      🟩 GCC6               Pass: 100%/2   | Total: 10m 22s | Avg:  5m 11s | Max:  5m 22s | Hits:  80%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 31m 32s | Avg:  5m 15s | Max:  5m 34s | Hits:  80%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 51m 00s | Avg:  8m 30s | Max: 23m 49s | Hits:  75%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 33m 39s | Avg:  5m 36s | Max:  6m 05s | Hits:  80%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 50m 04s | Avg: 12m 31s | Max: 31m 46s | Hits:  71%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 35m 05s | Avg:  5m 00s | Max:  6m 11s | Hits:  88%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 26m 10s | Avg:  6m 32s | Max:  6m 46s | Hits:  80%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  2h 47m | Avg:  8m 21s | Max: 31m 04s | Hits:  92%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 26m 33s | Avg:  8m 51s | Max:  9m 19s | Hits:  80%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 17m 49s | Avg: 17m 49s | Max: 17m 49s | Hits:  79%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 31m 06s | Avg: 15m 33s | Max: 15m 49s | Hits:  79%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 40m | Avg: 16m 43s | Max: 18m 34s | Hits:  89%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  5h 45m | Avg:  6m 46s | Max: 17m 31s | Hits:  83%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  6h 45m | Avg:  7m 22s | Max: 31m 46s | Hits:  84%/64788 
      🟩 Intel              Pass: 100%/3   | Total: 26m 33s | Avg:  8m 51s | Max:  9m 19s | Hits:  80%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 29m | Avg: 16m 35s | Max: 18m 34s | Hits:  85%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 15h 26m | Avg:  7m 51s | Max: 31m 46s | Hits:  84%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total: 11h 22m | Avg:  6m 53s | Max: 31m 46s | Hits:  81%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 41m | Avg:  9m 15s | Max: 18m 34s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 22m | Avg: 17m 47s | Max: 31m 04s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 17m 54s | Avg:  5m 58s | Max:  6m 11s | Hits:  80%/3534  
      🟩 90a                Pass: 100%/4   | Total: 22m 31s | Avg:  5m 37s | Max:  5m 55s | Hits:  80%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  3h 24m | Avg:  6m 49s | Max: 23m 49s | Hits:  83%/35328 
      🟩 14                 Pass: 100%/34  | Total:  4h 37m | Avg:  8m 10s | Max: 31m 04s | Hits:  84%/40020 
      🟩 17                 Pass: 100%/33  | Total:  4h 04m | Avg:  7m 25s | Max: 18m 34s | Hits:  84%/38847 
      🟩 20                 Pass: 100%/21  | Total:  3h 19m | Avg:  9m 28s | Max: 31m 46s | Hits:  84%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 51s | Avg: 11m 51s | Max: 11m 51s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

Contributor

🟨 CI finished in 3h 52m: Pass: 99%/250 | Total: 5d 06h | Avg: 30m 17s | Max: 1h 09m | Hits: 39%/247487
  • 🟨 cub: Pass: 99%/131 | Total: 2d 21h | Avg: 31m 48s | Max: 1h 09m | Hits: 48%/108575

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/123 | Total:  2d 16h | Avg: 31m 33s | Max:  1h 09m | Hits:  49%/101743
      🟩 arm64              Pass: 100%/8   | Total:  4h 46m | Avg: 35m 50s | Max: 37m 55s | Hits:  31%/6832  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  7h 58m | Avg: 31m 52s | Max: 46m 05s | Hits:  51%/11598 
      🟩 11.8               Pass: 100%/3   | Total:  2h 16m | Avg: 45m 29s | Max: 47m 41s | Hits:  52%/2562  
      🔍 12.5               Pass:  99%/113 | Total:  2d 11h | Avg: 31m 26s | Max:  1h 09m | Hits:  47%/94415 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 47m 38s | Avg: 23m 49s | Max: 23m 55s | Hits:  56%/1412  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  7h 58m | Avg: 31m 52s | Max: 46m 05s | Hits:  51%/11598 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  2h 16m | Avg: 45m 29s | Max: 47m 41s | Hits:  52%/2562  
      🔍 nvcc12.5           Pass:  99%/111 | Total:  2d 10h | Avg: 31m 34s | Max:  1h 09m | Hits:  47%/93003 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 47m 38s | Avg: 23m 49s | Max: 23m 55s | Hits:  56%/1412  
      🔍 nvcc               Pass:  99%/129 | Total:  2d 20h | Avg: 31m 56s | Max:  1h 09m | Hits:  48%/107163
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  3h 09m | Avg: 31m 33s | Max: 35m 04s | Hits:  30%/4902  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 48m | Avg: 36m 11s | Max: 38m 40s | Hits:  10%/2568  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 17m | Avg: 34m 20s | Max: 35m 33s | Hits:  10%/3424  
      🟩 Clang12            Pass: 100%/4   | Total:  2h 22m | Avg: 35m 44s | Max: 38m 03s | Hits:  10%/3424  
      🟩 Clang13            Pass: 100%/4   | Total:  2h 18m | Avg: 34m 43s | Max: 37m 12s | Hits:  10%/3424  
      🟩 Clang14            Pass: 100%/4   | Total:  2h 19m | Avg: 34m 57s | Max: 37m 24s | Hits:  10%/3424  
      🟩 Clang15            Pass: 100%/4   | Total:  2h 17m | Avg: 34m 17s | Max: 35m 56s | Hits:  52%/3416  
      🟩 Clang16            Pass: 100%/4   | Total:  2h 21m | Avg: 35m 19s | Max: 36m 57s | Hits:  52%/3416  
      🟩 Clang17            Pass: 100%/26  | Total: 11h 38m | Avg: 26m 52s | Max:  1h 09m | Hits:  82%/21908 
      🟩 GCC6               Pass: 100%/2   | Total:  1h 07m | Avg: 33m 53s | Max: 34m 32s | Hits:  51%/1556  
      🟩 GCC7               Pass: 100%/6   | Total:  3h 24m | Avg: 34m 03s | Max: 38m 17s | Hits:  29%/4905  
      🟩 GCC8               Pass: 100%/6   | Total:  3h 17m | Avg: 32m 57s | Max: 37m 24s | Hits:  29%/4905  
      🟩 GCC9               Pass: 100%/6   | Total:  3h 27m | Avg: 34m 35s | Max: 41m 36s | Hits:  29%/4905  
      🟩 GCC10              Pass: 100%/4   | Total:  2h 32m | Avg: 38m 11s | Max: 40m 02s | Hits:   9%/3424  
      🟩 GCC11              Pass: 100%/7   | Total:  4h 44m | Avg: 40m 35s | Max: 47m 41s | Hits:  52%/5978  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 31m | Avg: 37m 51s | Max: 38m 14s | Hits:  52%/3416  
      🔍 GCC13              Pass:  96%/28  | Total: 11h 04m | Avg: 23m 43s | Max: 38m 41s | Hits:  64%/23058 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 04m | Avg: 41m 24s | Max: 43m 37s | Hits:   5%/2340  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 46m 05s | Avg: 46m 05s | Max: 46m 05s | Hits:  55%/697   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 30m | Avg: 45m 22s | Max: 48m 13s | Hits:  55%/1394  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 22m | Avg: 47m 28s | Max: 49m 32s | Hits:  55%/2091  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/59  | Total:  1d 06h | Avg: 31m 05s | Max:  1h 09m | Hits:  49%/49906 
      🔍 GCC                Pass:  98%/63  | Total:  1d 08h | Avg: 30m 37s | Max: 47m 41s | Hits:  48%/52147 
      🟩 Intel              Pass: 100%/3   | Total:  2h 04m | Avg: 41m 24s | Max: 43m 37s | Hits:   5%/2340  
      🟩 MSVC               Pass: 100%/6   | Total:  4h 39m | Avg: 46m 32s | Max: 49m 32s | Hits:  55%/4182  
    🔍 jobs: GraphCapture 🔍
      🟩 Build              Pass: 100%/99  | Total:  2d 10h | Avg: 35m 19s | Max: 49m 32s | Hits:  31%/82101 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 19m | Avg: 17m 23s | Max: 19m 46s | Hits:  99%/6832  
      🔍 GraphCapture       Pass:  87%/8   | Total:  1h 44m | Avg: 13m 06s | Max: 16m 29s | Hits:  99%/5978  
      🟩 HostLaunch         Pass: 100%/8   | Total:  3h 56m | Avg: 29m 31s | Max:  1h 09m | Hits:  99%/6832  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 11m | Avg: 23m 52s | Max: 26m 21s | Hits:  99%/6832  
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/34  | Total: 17h 19m | Avg: 30m 34s | Max: 44m 20s | Hits:  45%/28605 
      🟩 14                 Pass: 100%/37  | Total: 20h 42m | Avg: 33m 34s | Max:  1h 09m | Hits:  47%/30696 
      🟩 17                 Pass: 100%/36  | Total: 19h 09m | Avg: 31m 56s | Max: 44m 27s | Hits:  47%/29927 
      🔍 20                 Pass:  95%/24  | Total: 12h 15m | Avg: 30m 39s | Max: 52m 13s | Hits:  54%/19347 
    🟨 gpu
      🟨 v100               Pass:  99%/131 | Total:  2d 21h | Avg: 31m 48s | Max:  1h 09m | Hits:  48%/108575
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  2h 16m | Avg: 45m 29s | Max: 47m 41s | Hits:  52%/2562  
      🟩 90a                Pass: 100%/4   | Total:  1h 22m | Avg: 20m 34s | Max: 22m 46s | Hits:   9%/3416  
    
  • 🟩 thrust: Pass: 100%/118 | Total: 2d 08h | Avg: 28m 45s | Max: 1h 05m | Hits: 33%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  2d 04h | Avg: 28m 34s | Max:  1h 05m | Hits:  34%/129492
      🟩 arm64              Pass: 100%/8   | Total:  4h 09m | Avg: 31m 08s | Max: 35m 36s | Hits:  17%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  7h 29m | Avg: 29m 59s | Max: 58m 51s | Hits:  19%/17660 
      🟩 11.8               Pass: 100%/3   | Total:  1h 56m | Avg: 38m 45s | Max: 42m 50s | Hits:  20%/3534  
      🟩 12.5               Pass: 100%/100 | Total:  1d 23h | Avg: 28m 15s | Max:  1h 05m | Hits:  35%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  1h 01m | Avg: 30m 55s | Max: 32m 51s | Hits:  19%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  7h 29m | Avg: 29m 59s | Max: 58m 51s | Hits:  19%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 56m | Avg: 38m 45s | Max: 42m 50s | Hits:  20%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 22h | Avg: 28m 12s | Max:  1h 05m | Hits:  36%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 55s | Max: 32m 51s | Hits:  19%/2354  
      🟩 nvcc               Pass: 100%/116 | Total:  2d 07h | Avg: 28m 42s | Max:  1h 05m | Hits:  33%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 49m | Avg: 28m 11s | Max: 32m 28s | Hits:  18%/7062  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 35m | Avg: 31m 50s | Max: 35m 29s | Hits:  15%/3531  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 06m | Avg: 31m 31s | Max: 37m 26s | Hits:  15%/4708  
      🟩 Clang12            Pass: 100%/4   | Total:  2h 00m | Avg: 30m 05s | Max: 32m 11s | Hits:  15%/4708  
      🟩 Clang13            Pass: 100%/4   | Total:  1h 57m | Avg: 29m 17s | Max: 31m 08s | Hits:  15%/4708  
      🟩 Clang14            Pass: 100%/4   | Total:  2h 03m | Avg: 30m 57s | Max: 33m 06s | Hits:  15%/4708  
      🟩 Clang15            Pass: 100%/4   | Total:  2h 02m | Avg: 30m 30s | Max: 33m 36s | Hits:  19%/4708  
      🟩 Clang16            Pass: 100%/4   | Total:  2h 01m | Avg: 30m 27s | Max: 32m 17s | Hits:  19%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  6h 24m | Avg: 21m 21s | Max: 35m 36s | Hits:  56%/21186 
      🟩 GCC6               Pass: 100%/2   | Total: 55m 35s | Avg: 27m 47s | Max: 29m 43s | Hits:  20%/2354  
      🟩 GCC7               Pass: 100%/6   | Total:  2h 57m | Avg: 29m 36s | Max: 34m 52s | Hits:  17%/7068  
      🟩 GCC8               Pass: 100%/6   | Total:  2h 55m | Avg: 29m 17s | Max: 33m 17s | Hits:  17%/7068  
      🟩 GCC9               Pass: 100%/6   | Total:  3h 00m | Avg: 30m 02s | Max: 32m 56s | Hits:  17%/7068  
      🟩 GCC10              Pass: 100%/4   | Total:  2h 16m | Avg: 34m 07s | Max: 37m 04s | Hits:  15%/4712  
      🟩 GCC11              Pass: 100%/7   | Total:  3h 46m | Avg: 32m 22s | Max: 42m 50s | Hits:  38%/8246  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 09m | Avg: 32m 23s | Max: 35m 24s | Hits:  19%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  6h 27m | Avg: 19m 21s | Max: 33m 45s | Hits:  58%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 05m | Avg: 41m 57s | Max: 45m 10s | Hits:   4%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 58m 51s | Avg: 58m 51s | Max: 58m 51s | Hits:  17%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 16s | Max: 58m 53s | Hits:  17%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  4h 02m | Avg: 40m 21s | Max:  1h 05m | Hits:  58%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 23h 00m | Avg: 27m 03s | Max: 37m 26s | Hits:  30%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  1d 00h | Avg: 26m 42s | Max: 42m 50s | Hits:  35%/64788 
      🟩 Intel              Pass: 100%/3   | Total:  2h 05m | Avg: 41m 57s | Max: 45m 10s | Hits:   4%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  6h 57m | Avg: 46m 23s | Max:  1h 05m | Hits:  44%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  2d 08h | Avg: 28m 45s | Max:  1h 05m | Hits:  33%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  2d 05h | Avg: 32m 13s | Max:  1h 05m | Hits:  20%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 44m | Avg:  9m 27s | Max: 19m 08s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 38m | Avg: 12m 22s | Max: 13m 51s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 56m | Avg: 38m 45s | Max: 42m 50s | Hits:  20%/3534  
      🟩 90a                Pass: 100%/4   | Total:  1h 18m | Avg: 19m 32s | Max: 21m 33s | Hits:  15%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total: 12h 07m | Avg: 24m 14s | Max: 35m 54s | Hits:  32%/35328 
      🟩 14                 Pass: 100%/34  | Total: 17h 28m | Avg: 30m 50s | Max:  1h 02m | Hits:  31%/40020 
      🟩 17                 Pass: 100%/33  | Total: 17h 08m | Avg: 31m 10s | Max:  1h 05m | Hits:  32%/38847 
      🟩 20                 Pass: 100%/21  | Total:  9h 48m | Avg: 28m 01s | Max:  1h 01m | Hits:  39%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

Contributor

🟩 CI finished in 15h 13m: Pass: 100%/250 | Total: 5d 06h | Avg: 30m 19s | Max: 1h 09m | Hits: 40%/248341
  • 🟩 cub: Pass: 100%/131 | Total: 2d 21h | Avg: 31m 53s | Max: 1h 09m | Hits: 48%/109429

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total:  2d 16h | Avg: 31m 38s | Max:  1h 09m | Hits:  49%/102597
      🟩 arm64              Pass: 100%/8   | Total:  4h 46m | Avg: 35m 50s | Max: 37m 55s | Hits:  31%/6832  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  7h 58m | Avg: 31m 52s | Max: 46m 05s | Hits:  51%/11598 
      🟩 11.8               Pass: 100%/3   | Total:  2h 16m | Avg: 45m 29s | Max: 47m 41s | Hits:  52%/2562  
      🟩 12.5               Pass: 100%/113 | Total:  2d 11h | Avg: 31m 32s | Max:  1h 09m | Hits:  48%/95269 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 47m 38s | Avg: 23m 49s | Max: 23m 55s | Hits:  56%/1412  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  7h 58m | Avg: 31m 52s | Max: 46m 05s | Hits:  51%/11598 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  2h 16m | Avg: 45m 29s | Max: 47m 41s | Hits:  52%/2562  
      🟩 nvcc12.5           Pass: 100%/111 | Total:  2d 10h | Avg: 31m 40s | Max:  1h 09m | Hits:  48%/93857 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 47m 38s | Avg: 23m 49s | Max: 23m 55s | Hits:  56%/1412  
      🟩 nvcc               Pass: 100%/129 | Total:  2d 20h | Avg: 32m 01s | Max:  1h 09m | Hits:  48%/108017
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  3h 09m | Avg: 31m 33s | Max: 35m 04s | Hits:  30%/4902  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 48m | Avg: 36m 11s | Max: 38m 40s | Hits:  10%/2568  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 17m | Avg: 34m 20s | Max: 35m 33s | Hits:  10%/3424  
      🟩 Clang12            Pass: 100%/4   | Total:  2h 22m | Avg: 35m 44s | Max: 38m 03s | Hits:  10%/3424  
      🟩 Clang13            Pass: 100%/4   | Total:  2h 18m | Avg: 34m 43s | Max: 37m 12s | Hits:  10%/3424  
      🟩 Clang14            Pass: 100%/4   | Total:  2h 19m | Avg: 34m 57s | Max: 37m 24s | Hits:  10%/3424  
      🟩 Clang15            Pass: 100%/4   | Total:  2h 17m | Avg: 34m 17s | Max: 35m 56s | Hits:  52%/3416  
      🟩 Clang16            Pass: 100%/4   | Total:  2h 21m | Avg: 35m 19s | Max: 36m 57s | Hits:  52%/3416  
      🟩 Clang17            Pass: 100%/26  | Total: 11h 38m | Avg: 26m 52s | Max:  1h 09m | Hits:  82%/21908 
      🟩 GCC6               Pass: 100%/2   | Total:  1h 07m | Avg: 33m 53s | Max: 34m 32s | Hits:  51%/1556  
      🟩 GCC7               Pass: 100%/6   | Total:  3h 24m | Avg: 34m 03s | Max: 38m 17s | Hits:  29%/4905  
      🟩 GCC8               Pass: 100%/6   | Total:  3h 17m | Avg: 32m 57s | Max: 37m 24s | Hits:  29%/4905  
      🟩 GCC9               Pass: 100%/6   | Total:  3h 27m | Avg: 34m 35s | Max: 41m 36s | Hits:  29%/4905  
      🟩 GCC10              Pass: 100%/4   | Total:  2h 32m | Avg: 38m 11s | Max: 40m 02s | Hits:   9%/3424  
      🟩 GCC11              Pass: 100%/7   | Total:  4h 44m | Avg: 40m 35s | Max: 47m 41s | Hits:  52%/5978  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 31m | Avg: 37m 51s | Max: 38m 14s | Hits:  52%/3416  
      🟩 GCC13              Pass: 100%/28  | Total: 11h 14m | Avg: 24m 06s | Max: 38m 41s | Hits:  65%/23912 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 04m | Avg: 41m 24s | Max: 43m 37s | Hits:   5%/2340  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 46m 05s | Avg: 46m 05s | Max: 46m 05s | Hits:  55%/697   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 30m | Avg: 45m 22s | Max: 48m 13s | Hits:  55%/1394  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 22m | Avg: 47m 28s | Max: 49m 32s | Hits:  55%/2091  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  1d 06h | Avg: 31m 05s | Max:  1h 09m | Hits:  49%/49906 
      🟩 GCC                Pass: 100%/63  | Total:  1d 08h | Avg: 30m 48s | Max: 47m 41s | Hits:  49%/53001 
      🟩 Intel              Pass: 100%/3   | Total:  2h 04m | Avg: 41m 24s | Max: 43m 37s | Hits:   5%/2340  
      🟩 MSVC               Pass: 100%/6   | Total:  4h 39m | Avg: 46m 32s | Max: 49m 32s | Hits:  55%/4182  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total:  2d 21h | Avg: 31m 53s | Max:  1h 09m | Hits:  48%/109429
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  2d 10h | Avg: 35m 19s | Max: 49m 32s | Hits:  31%/82101 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 19m | Avg: 17m 23s | Max: 19m 46s | Hits:  99%/6832  
      🟩 GraphCapture       Pass: 100%/8   | Total:  1h 55m | Avg: 14m 27s | Max: 16m 29s | Hits:  99%/6832  
      🟩 HostLaunch         Pass: 100%/8   | Total:  3h 56m | Avg: 29m 31s | Max:  1h 09m | Hits:  99%/6832  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 11m | Avg: 23m 52s | Max: 26m 21s | Hits:  99%/6832  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  2h 16m | Avg: 45m 29s | Max: 47m 41s | Hits:  52%/2562  
      🟩 90a                Pass: 100%/4   | Total:  1h 22m | Avg: 20m 34s | Max: 22m 46s | Hits:   9%/3416  
    🟩 std
      🟩 11                 Pass: 100%/34  | Total: 17h 19m | Avg: 30m 34s | Max: 44m 20s | Hits:  45%/28605 
      🟩 14                 Pass: 100%/37  | Total: 20h 42m | Avg: 33m 34s | Max:  1h 09m | Hits:  47%/30696 
      🟩 17                 Pass: 100%/36  | Total: 19h 09m | Avg: 31m 56s | Max: 44m 27s | Hits:  47%/29927 
      🟩 20                 Pass: 100%/24  | Total: 12h 26m | Avg: 31m 06s | Max: 52m 13s | Hits:  56%/20201 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 2d 08h | Avg: 28m 45s | Max: 1h 05m | Hits: 33%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  2d 04h | Avg: 28m 34s | Max:  1h 05m | Hits:  34%/129492
      🟩 arm64              Pass: 100%/8   | Total:  4h 09m | Avg: 31m 08s | Max: 35m 36s | Hits:  17%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  7h 29m | Avg: 29m 59s | Max: 58m 51s | Hits:  19%/17660 
      🟩 11.8               Pass: 100%/3   | Total:  1h 56m | Avg: 38m 45s | Max: 42m 50s | Hits:  20%/3534  
      🟩 12.5               Pass: 100%/100 | Total:  1d 23h | Avg: 28m 15s | Max:  1h 05m | Hits:  35%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  1h 01m | Avg: 30m 55s | Max: 32m 51s | Hits:  19%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  7h 29m | Avg: 29m 59s | Max: 58m 51s | Hits:  19%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 56m | Avg: 38m 45s | Max: 42m 50s | Hits:  20%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 22h | Avg: 28m 12s | Max:  1h 05m | Hits:  36%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 55s | Max: 32m 51s | Hits:  19%/2354  
      🟩 nvcc               Pass: 100%/116 | Total:  2d 07h | Avg: 28m 42s | Max:  1h 05m | Hits:  33%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 49m | Avg: 28m 11s | Max: 32m 28s | Hits:  18%/7062  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 35m | Avg: 31m 50s | Max: 35m 29s | Hits:  15%/3531  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 06m | Avg: 31m 31s | Max: 37m 26s | Hits:  15%/4708  
      🟩 Clang12            Pass: 100%/4   | Total:  2h 00m | Avg: 30m 05s | Max: 32m 11s | Hits:  15%/4708  
      🟩 Clang13            Pass: 100%/4   | Total:  1h 57m | Avg: 29m 17s | Max: 31m 08s | Hits:  15%/4708  
      🟩 Clang14            Pass: 100%/4   | Total:  2h 03m | Avg: 30m 57s | Max: 33m 06s | Hits:  15%/4708  
      🟩 Clang15            Pass: 100%/4   | Total:  2h 02m | Avg: 30m 30s | Max: 33m 36s | Hits:  19%/4708  
      🟩 Clang16            Pass: 100%/4   | Total:  2h 01m | Avg: 30m 27s | Max: 32m 17s | Hits:  19%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  6h 24m | Avg: 21m 21s | Max: 35m 36s | Hits:  56%/21186 
      🟩 GCC6               Pass: 100%/2   | Total: 55m 35s | Avg: 27m 47s | Max: 29m 43s | Hits:  20%/2354  
      🟩 GCC7               Pass: 100%/6   | Total:  2h 57m | Avg: 29m 36s | Max: 34m 52s | Hits:  17%/7068  
      🟩 GCC8               Pass: 100%/6   | Total:  2h 55m | Avg: 29m 17s | Max: 33m 17s | Hits:  17%/7068  
      🟩 GCC9               Pass: 100%/6   | Total:  3h 00m | Avg: 30m 02s | Max: 32m 56s | Hits:  17%/7068  
      🟩 GCC10              Pass: 100%/4   | Total:  2h 16m | Avg: 34m 07s | Max: 37m 04s | Hits:  15%/4712  
      🟩 GCC11              Pass: 100%/7   | Total:  3h 46m | Avg: 32m 22s | Max: 42m 50s | Hits:  38%/8246  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 09m | Avg: 32m 23s | Max: 35m 24s | Hits:  19%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  6h 27m | Avg: 19m 21s | Max: 33m 45s | Hits:  58%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 05m | Avg: 41m 57s | Max: 45m 10s | Hits:   4%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 58m 51s | Avg: 58m 51s | Max: 58m 51s | Hits:  17%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 16s | Max: 58m 53s | Hits:  17%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  4h 02m | Avg: 40m 21s | Max:  1h 05m | Hits:  58%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 23h 00m | Avg: 27m 03s | Max: 37m 26s | Hits:  30%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  1d 00h | Avg: 26m 42s | Max: 42m 50s | Hits:  35%/64788 
      🟩 Intel              Pass: 100%/3   | Total:  2h 05m | Avg: 41m 57s | Max: 45m 10s | Hits:   4%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  6h 57m | Avg: 46m 23s | Max:  1h 05m | Hits:  44%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  2d 08h | Avg: 28m 45s | Max:  1h 05m | Hits:  33%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  2d 05h | Avg: 32m 13s | Max:  1h 05m | Hits:  20%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 44m | Avg:  9m 27s | Max: 19m 08s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 38m | Avg: 12m 22s | Max: 13m 51s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 56m | Avg: 38m 45s | Max: 42m 50s | Hits:  20%/3534  
      🟩 90a                Pass: 100%/4   | Total:  1h 18m | Avg: 19m 32s | Max: 21m 33s | Hits:  15%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total: 12h 07m | Avg: 24m 14s | Max: 35m 54s | Hits:  32%/35328 
      🟩 14                 Pass: 100%/34  | Total: 17h 28m | Avg: 30m 50s | Max:  1h 02m | Hits:  31%/40020 
      🟩 17                 Pass: 100%/33  | Total: 17h 08m | Avg: 31m 10s | Max:  1h 05m | Hits:  32%/38847 
      🟩 20                 Pass: 100%/21  | Total:  9h 48m | Avg: 28m 01s | Max:  1h 01m | Hits:  39%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@alliepiper
Collaborator

alliepiper commented Jul 18, 2024

You've found another can of worms! 😄

We've had many discussions about these intermediate accumulator types, and IIRC we decided on determining the return type of op(init, input) and using that for the accumulations. For instance, in CUB scan:

https://github.com/NVIDIA/cccl/blob/main/cub/cub/device/dispatch/dispatch_scan.cuh#L237-L241

/// The type of intermediate accumulator (according to P2322R6)
template <typename Invokable, typename InitT, typename InputT>
using accumulator_t = typename ::cuda::std::decay<invoke_result_t<Invokable, InitT, InputT>>::type;

If we're going to break behavior, we should use the same approach in the new version, and see if there are any other algorithms that need to be fixed. It'd be worth discussing this in the team meeting before putting too much work in here in case we'd need to delay this to a major release.
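To make the accumulator rule above concrete, here is a minimal standard-C++ sketch of deducing the accumulator as the decayed type of op(init, input). The names `accumulator_sketch_t`, `widening_plus`, and `narrow_plus` are hypothetical and exist only for illustration; the sketch uses `decltype` with `std::declval` as a stand-in for the `invoke_result_t` used in the CUB snippet, and it is not the CUB implementation.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the accumulator_t alias quoted above:
// the decayed type of calling op(init, input).
template <typename Op, typename InitT, typename InputT>
using accumulator_sketch_t =
  typename std::decay<decltype(std::declval<Op>()(std::declval<InitT>(), std::declval<InputT>()))>::type;

// Illustrative functor whose call operator returns a 64-bit result.
struct widening_plus
{
  constexpr std::int64_t operator()(std::int64_t a, std::int64_t b) const
  {
    return a + b;
  }
};

// Illustrative functor whose call operator narrows back to 8 bits.
struct narrow_plus
{
  constexpr std::int8_t operator()(std::int8_t a, std::int8_t b) const
  {
    return static_cast<std::int8_t>(a + b);
  }
};

// With 8-bit inputs, the accumulator follows the operator's return type.
static_assert(std::is_same<accumulator_sketch_t<widening_plus, std::int8_t, std::int8_t>, std::int64_t>::value,
              "widening operator yields a 64-bit accumulator");
static_assert(std::is_same<accumulator_sketch_t<narrow_plus, std::int8_t, std::int8_t>, std::int8_t>::value,
              "narrow operator yields an 8-bit accumulator");
```

Under this rule the accumulator type is driven by what the operator actually returns, not by the input iterator's value type or by an advertised nested typedef.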

@bernhardmgruber bernhardmgruber force-pushed the partial_sum_type branch 2 times, most recently from 8af06ab to 949cc83 Compare July 18, 2024 20:00
@bernhardmgruber
Contributor Author

> You've found another can of worms! 😄

I hope I get a prize at some point!

> We've had many discussions about these intermediate accumulator types, and IIRC we decided on determining the return type of op(init, input) and using that for the accumulations.

That's perfect, I will use this then!

> If we're going to break behavior, we should use the same approach in the new version, and see if there are any other algorithms that need to be fixed.

I looked into reduce_by_key.h for the CUDA backend, and it also uses the input value iterator's type as value_type for partial sums: https://github.com/NVIDIA/cccl/blob/main/thrust/thrust/system/cuda/detail/reduce_by_key.h#L200

> It'd be worth discussing this in the team meeting before putting too much work in here in case we'd need to delay this to a major release.

Fair enough, let's discuss it in a team meeting.
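As a small illustration of why the choice of partial-sum type matters, the sketch below sums the same inputs once in an accumulator of the input's own type and once in a wider one. The helper `sum_repeated` is hypothetical, and unsigned 8-bit values are used only so that the wrap-around is well defined in standard C++; it does not model any Thrust backend.

```cpp
#include <cstdint>

// Hypothetical helper: adds `value` to an accumulator of type AccumT
// `count` times, narrowing back to AccumT after every step.
template <typename AccumT>
constexpr AccumT sum_repeated(std::uint8_t value, int count)
{
  AccumT acc = 0;
  for (int i = 0; i < count; ++i)
  {
    // The addition itself happens in a promoted type; the cast models
    // storing the partial sum back into the accumulator variable.
    acc = static_cast<AccumT>(acc + value);
  }
  return acc;
}

// Accumulating in the input's own 8-bit type wraps modulo 256 ...
static_assert(sum_repeated<std::uint8_t>(100, 3) == 44, "300 mod 256 == 44");
// ... while a wider accumulator keeps the full sum.
static_assert(sum_repeated<std::int64_t>(100, 3) == 300, "no wrap-around");
```

A backend that stores partial sums in the iterator's value type behaves like the first case, even if the reduction functor itself returns a wider type.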

@miscco
Collaborator

miscco commented Jul 22, 2024

> we decided on determining the return type of op(init, input) and using that for the accumulations.

This is also the approach taken by the standard, so this is what we should be doing.

@bernhardmgruber
Contributor Author

We discussed this PR during the code review hour, and @gevtushenko suggested checking for a nested ::result_type first, before falling back to the logic of CUB's accumulator_t.

We also discussed moving CUB's accumulator_t into cudax, but cudax does not seem to be enabled in all cases, so I duplicated the implementation in Thrust. If somebody could tell me a good place in the project tree, I can happily move the trait after this PR. I considered a new header in libcu++'s <cuda/type_traits>, but this feels odd as well, given that the functionality must work for TBB and OpenMP as well.
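The selection order described above (a nested ::result_type first, otherwise the decayed return type of the operator) could be sketched with the detection idiom as follows. All names here (`partial_sum_sketch`, `void_sketch_t`, `legacy_plus`, `modern_plus`) are hypothetical, and this is only an assumption about the shape of such a trait, not the code that was merged.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// C++11-compatible void_t helper for the detection idiom.
template <typename...>
struct make_void_sketch
{
  using type = void;
};
template <typename... Ts>
using void_sketch_t = typename make_void_sketch<Ts...>::type;

// Fallback: decayed type of op(value, value), mirroring the accumulator_t logic.
template <typename Op, typename T, typename = void>
struct partial_sum_sketch
{
  using type = typename std::decay<decltype(std::declval<Op>()(std::declval<T>(), std::declval<T>()))>::type;
};

// Preferred: a nested ::result_type, if the functor advertises one.
template <typename Op, typename T>
struct partial_sum_sketch<Op, T, void_sketch_t<typename Op::result_type>>
{
  using type = typename Op::result_type;
};

// Old-style functor advertising a nested result_type.
struct legacy_plus
{
  using result_type = std::int64_t;
  constexpr std::int64_t operator()(std::int64_t a, std::int64_t b) const
  {
    return a + b;
  }
};

// Functor without result_type; returns whatever a + b yields.
struct modern_plus
{
  template <typename A, typename B>
  constexpr auto operator()(A a, B b) const -> decltype(a + b)
  {
    return a + b;
  }
};

// The advertised result_type wins when it exists.
static_assert(std::is_same<partial_sum_sketch<legacy_plus, std::int8_t>::type, std::int64_t>::value,
              "nested result_type is preferred");
// Otherwise the operator's return type is used: int8_t + int8_t promotes to int.
static_assert(std::is_same<partial_sum_sketch<modern_plus, std::int8_t>::type, int>::value,
              "falls back to the decayed return type");
```

Checking ::result_type first keeps the previous behavior for functors that still provide the typedef, while functors without it no longer need one.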

@bernhardmgruber bernhardmgruber force-pushed the partial_sum_type branch 3 times, most recently from e3f8056 to fc71d1d Compare July 22, 2024 19:31
@bernhardmgruber bernhardmgruber changed the title Use value iterator's value type for partial sums in TBB reduce_by_key Use accumulator_t's logic for partial sums in TBB reduce_by_key Jul 22, 2024
Contributor

🟩 CI finished in 3h 40m: Pass: 100%/250 | Total: 4d 23h | Avg: 28m 44s | Max: 1h 00m | Hits: 61%/248341
  • 🟩 cub: Pass: 100%/131 | Total: 2d 18h | Avg: 30m 22s | Max: 51m 10s | Hits: 70%/109429

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total:  2d 13h | Avg: 30m 06s | Max: 51m 10s | Hits:  71%/102597
      🟩 arm64              Pass: 100%/8   | Total:  4h 35m | Avg: 34m 29s | Max: 39m 06s | Hits:  61%/6832  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  7h 39m | Avg: 30m 39s | Max: 44m 35s | Hits:  61%/11598 
      🟩 11.8               Pass: 100%/3   | Total:  2h 15m | Avg: 45m 14s | Max: 51m 10s | Hits:  60%/2562  
      🟩 12.5               Pass: 100%/113 | Total:  2d 08h | Avg: 29m 56s | Max: 46m 46s | Hits:  72%/95269 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 42m 16s | Avg: 21m 08s | Max: 21m 36s | Hits:  66%/1412  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  7h 39m | Avg: 30m 39s | Max: 44m 35s | Hits:  61%/11598 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  2h 15m | Avg: 45m 14s | Max: 51m 10s | Hits:  60%/2562  
      🟩 nvcc12.5           Pass: 100%/111 | Total:  2d 07h | Avg: 30m 06s | Max: 46m 46s | Hits:  72%/93857 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 42m 16s | Avg: 21m 08s | Max: 21m 36s | Hits:  66%/1412  
      🟩 nvcc               Pass: 100%/129 | Total:  2d 17h | Avg: 30m 31s | Max: 51m 10s | Hits:  70%/108017
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  3h 08m | Avg: 31m 29s | Max: 35m 49s | Hits:  61%/4902  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 48m | Avg: 36m 15s | Max: 36m 56s | Hits:  61%/2568  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 14m | Avg: 33m 38s | Max: 34m 31s | Hits:  61%/3424  
      🟩 Clang12            Pass: 100%/4   | Total:  2h 20m | Avg: 35m 08s | Max: 37m 12s | Hits:  61%/3424  
      🟩 Clang13            Pass: 100%/4   | Total:  2h 14m | Avg: 33m 32s | Max: 35m 13s | Hits:  61%/3424  
      🟩 Clang14            Pass: 100%/4   | Total:  2h 14m | Avg: 33m 37s | Max: 34m 39s | Hits:  61%/3424  
      🟩 Clang15            Pass: 100%/4   | Total:  2h 17m | Avg: 34m 19s | Max: 35m 05s | Hits:  61%/3416  
      🟩 Clang16            Pass: 100%/4   | Total:  2h 16m | Avg: 34m 01s | Max: 36m 10s | Hits:  61%/3416  
      🟩 Clang17            Pass: 100%/26  | Total: 10h 16m | Avg: 23m 41s | Max: 39m 14s | Hits:  85%/21908 
      🟩 GCC6               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 04s | Max: 34m 12s | Hits:  60%/1556  
      🟩 GCC7               Pass: 100%/6   | Total:  3h 04m | Avg: 30m 41s | Max: 32m 37s | Hits:  60%/4905  
      🟩 GCC8               Pass: 100%/6   | Total:  3h 15m | Avg: 32m 34s | Max: 36m 53s | Hits:  60%/4905  
      🟩 GCC9               Pass: 100%/6   | Total:  3h 27m | Avg: 34m 38s | Max: 41m 37s | Hits:  60%/4905  
      🟩 GCC10              Pass: 100%/4   | Total:  2h 26m | Avg: 36m 42s | Max: 38m 04s | Hits:  60%/3424  
      🟩 GCC11              Pass: 100%/7   | Total:  4h 36m | Avg: 39m 27s | Max: 51m 10s | Hits:  60%/5978  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 21m | Avg: 35m 26s | Max: 36m 52s | Hits:  60%/3416  
      🟩 GCC13              Pass: 100%/28  | Total: 10h 57m | Avg: 23m 28s | Max: 39m 06s | Hits:  82%/23912 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 52m | Avg: 37m 27s | Max: 38m 23s | Hits:  61%/2340  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 44m 35s | Avg: 44m 35s | Max: 44m 35s | Hits:  65%/697   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 28m | Avg: 44m 19s | Max: 46m 46s | Hits:  65%/1394  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 11m | Avg: 43m 46s | Max: 43m 59s | Hits:  65%/2091  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  1d 04h | Avg: 29m 20s | Max: 39m 14s | Hits:  72%/49906 
      🟩 GCC                Pass: 100%/63  | Total:  1d 07h | Avg: 29m 42s | Max: 51m 10s | Hits:  70%/53001 
      🟩 Intel              Pass: 100%/3   | Total:  1h 52m | Avg: 37m 27s | Max: 38m 23s | Hits:  61%/2340  
      🟩 MSVC               Pass: 100%/6   | Total:  4h 24m | Avg: 44m 05s | Max: 46m 46s | Hits:  65%/4182  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total:  2d 18h | Avg: 30m 22s | Max: 51m 10s | Hits:  70%/109429
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  2d 08h | Avg: 34m 15s | Max: 51m 10s | Hits:  61%/82101 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 24m | Avg: 18m 00s | Max: 19m 07s | Hits:  99%/6832  
      🟩 GraphCapture       Pass: 100%/8   | Total:  1h 55m | Avg: 14m 23s | Max: 16m 10s | Hits:  99%/6832  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 10m | Avg: 16m 15s | Max: 17m 46s | Hits:  99%/6832  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 19m | Avg: 24m 53s | Max: 27m 21s | Hits:  99%/6832  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  2h 15m | Avg: 45m 14s | Max: 51m 10s | Hits:  60%/2562  
      🟩 90a                Pass: 100%/4   | Total:  1h 14m | Avg: 18m 40s | Max: 20m 06s | Hits:  60%/3416  
    🟩 std
      🟩 11                 Pass: 100%/34  | Total: 16h 47m | Avg: 29m 37s | Max: 41m 10s | Hits:  70%/28605 
      🟩 14                 Pass: 100%/37  | Total: 19h 30m | Avg: 31m 38s | Max: 51m 10s | Hits:  69%/30696 
      🟩 17                 Pass: 100%/36  | Total: 18h 29m | Avg: 30m 49s | Max: 43m 31s | Hits:  70%/29927 
      🟩 20                 Pass: 100%/24  | Total: 11h 32m | Avg: 28m 50s | Max: 43m 50s | Hits:  74%/20201 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 2d 05h | Avg: 27m 03s | Max: 1h 00m | Hits: 54%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  2d 01h | Avg: 27m 02s | Max:  1h 00m | Hits:  54%/129492
      🟩 arm64              Pass: 100%/8   | Total:  3h 38m | Avg: 27m 15s | Max: 30m 07s | Hits:  49%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  6h 40m | Avg: 26m 41s | Max: 55m 03s | Hits:  45%/17660 
      🟩 11.8               Pass: 100%/3   | Total:  1h 53m | Avg: 37m 50s | Max: 38m 43s | Hits:  47%/3534  
      🟩 12.5               Pass: 100%/100 | Total:  1d 20h | Avg: 26m 47s | Max:  1h 00m | Hits:  55%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 56m 06s | Avg: 28m 03s | Max: 29m 34s | Hits:  47%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 40m | Avg: 26m 41s | Max: 55m 03s | Hits:  45%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 53m | Avg: 37m 50s | Max: 38m 43s | Hits:  47%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 19h | Avg: 26m 45s | Max:  1h 00m | Hits:  55%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 56m 06s | Avg: 28m 03s | Max: 29m 34s | Hits:  47%/2354  
      🟩 nvcc               Pass: 100%/116 | Total:  2d 04h | Avg: 27m 02s | Max:  1h 00m | Hits:  54%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 40m | Avg: 26m 49s | Max: 34m 01s | Hits:  39%/7062  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 32m | Avg: 30m 47s | Max: 34m 13s | Hits:  34%/3531  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 00m | Avg: 30m 05s | Max: 33m 43s | Hits:  34%/4708  
      🟩 Clang12            Pass: 100%/4   | Total:  1h 57m | Avg: 29m 16s | Max: 31m 38s | Hits:  34%/4708  
      🟩 Clang13            Pass: 100%/4   | Total:  1h 57m | Avg: 29m 24s | Max: 30m 57s | Hits:  34%/4708  
      🟩 Clang14            Pass: 100%/4   | Total:  1h 50m | Avg: 27m 42s | Max: 29m 23s | Hits:  47%/4708  
      🟩 Clang15            Pass: 100%/4   | Total:  1h 51m | Avg: 27m 52s | Max: 30m 51s | Hits:  47%/4708  
      🟩 Clang16            Pass: 100%/4   | Total:  1h 57m | Avg: 29m 18s | Max: 32m 01s | Hits:  47%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  5h 53m | Avg: 19m 38s | Max: 32m 17s | Hits:  71%/21186 
      🟩 GCC6               Pass: 100%/2   | Total: 45m 57s | Avg: 22m 58s | Max: 24m 38s | Hits:  48%/2354  
      🟩 GCC7               Pass: 100%/6   | Total:  2h 36m | Avg: 26m 09s | Max: 29m 34s | Hits:  47%/7068  
      🟩 GCC8               Pass: 100%/6   | Total:  2h 42m | Avg: 27m 01s | Max: 34m 24s | Hits:  47%/7068  
      🟩 GCC9               Pass: 100%/6   | Total:  2h 48m | Avg: 28m 05s | Max: 33m 06s | Hits:  47%/7068  
      🟩 GCC10              Pass: 100%/4   | Total:  1h 57m | Avg: 29m 25s | Max: 32m 20s | Hits:  47%/4712  
      🟩 GCC11              Pass: 100%/7   | Total:  3h 49m | Avg: 32m 45s | Max: 38m 43s | Hits:  56%/8246  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 06m | Avg: 31m 33s | Max: 35m 15s | Hits:  47%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  6h 07m | Avg: 18m 22s | Max: 30m 07s | Hits:  72%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 03m | Avg: 41m 14s | Max: 45m 13s | Hits:  34%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 55m 03s | Avg: 55m 03s | Max: 55m 03s | Hits:  32%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 43s | Max: 57m 20s | Hits:  32%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 46m | Avg: 37m 48s | Max:  1h 00m | Hits:  65%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 21h 41m | Avg: 25m 30s | Max: 34m 13s | Hits:  51%/60027 
      🟩 GCC                Pass: 100%/55  | Total: 22h 54m | Avg: 24m 59s | Max: 38m 43s | Hits:  57%/64788 
      🟩 Intel              Pass: 100%/3   | Total:  2h 03m | Avg: 41m 14s | Max: 45m 13s | Hits:  34%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  6h 33m | Avg: 43m 42s | Max:  1h 00m | Hits:  54%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  2d 05h | Avg: 27m 03s | Max:  1h 00m | Hits:  54%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  2d 01h | Avg: 30m 09s | Max:  1h 00m | Hits:  45%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 43m | Avg:  9m 23s | Max: 18m 41s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 43m | Avg: 12m 54s | Max: 16m 08s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 53m | Avg: 37m 50s | Max: 38m 43s | Hits:  47%/3534  
      🟩 90a                Pass: 100%/4   | Total:  1h 10m | Avg: 17m 31s | Max: 18m 45s | Hits:  47%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total: 11h 11m | Avg: 22m 23s | Max: 36m 57s | Hits:  55%/35328 
      🟩 14                 Pass: 100%/34  | Total: 16h 25m | Avg: 28m 59s | Max: 55m 03s | Hits:  51%/40020 
      🟩 17                 Pass: 100%/33  | Total: 16h 04m | Avg: 29m 14s | Max: 58m 53s | Hits:  53%/38847 
      🟩 20                 Pass: 100%/21  | Total:  9h 30m | Avg: 27m 09s | Max:  1h 00m | Hits:  58%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s
    

bernhardmgruber and others added 4 commits July 25, 2024 21:21
This allows us to get rid of partial_sum_type, which still uses the C++11-deprecated function object API ::result_type.
Co-authored-by: Georgii Evtushenko <[email protected]>
Contributor

🟨 CI finished in 7h 28m: Pass: 95%/250 | Total: 1d 07h | Avg: 7m 38s | Max: 28m 26s | Hits: 90%/239877
  • 🟨 cub: Pass: 93%/131 | Total: 17h 03m | Avg: 7m 48s | Max: 28m 26s | Hits: 99%/103321

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  92%/123 | Total: 16h 22m | Avg:  7m 59s | Max: 28m 26s | Hits:  99%/96385 
      🟩 arm64              Pass: 100%/8   | Total: 40m 27s | Avg:  5m 03s | Max:  5m 32s | Hits:  99%/6936  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 04m | Avg:  4m 17s | Max: 12m 38s | Hits:  99%/11792 
      🟩 11.8               Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 50s | Hits:  99%/2601  
      🔍 12.5               Pass:  92%/113 | Total: 15h 45m | Avg:  8m 21s | Max: 28m 26s | Hits:  99%/88928 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  3m 36s | Hits: 100%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 04m | Avg:  4m 17s | Max: 12m 38s | Hits:  99%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 50s | Hits:  99%/2601  
      🔍 nvcc12.5           Pass:  91%/111 | Total: 15h 38m | Avg:  8m 27s | Max: 28m 26s | Hits:  99%/87492 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  3m 36s | Hits: 100%/1436  
      🔍 nvcc               Pass:  93%/129 | Total: 16h 56m | Avg:  7m 52s | Max: 28m 26s | Hits:  99%/101885
    🟨 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 27m 08s | Avg:  4m 31s | Max:  5m 25s | Hits: 100%/4980  
      🟩 Clang10            Pass: 100%/3   | Total: 15m 50s | Avg:  5m 16s | Max:  5m 45s | Hits: 100%/2607  
      🟩 Clang11            Pass: 100%/4   | Total: 18m 04s | Avg:  4m 31s | Max:  4m 55s | Hits: 100%/3476  
      🟩 Clang12            Pass: 100%/4   | Total: 18m 35s | Avg:  4m 38s | Max:  4m 52s | Hits: 100%/3476  
      🟩 Clang13            Pass: 100%/4   | Total: 17m 30s | Avg:  4m 22s | Max:  4m 34s | Hits: 100%/3476  
      🟩 Clang14            Pass: 100%/4   | Total: 17m 47s | Avg:  4m 26s | Max:  4m 43s | Hits: 100%/3476  
      🟩 Clang15            Pass: 100%/4   | Total: 18m 28s | Avg:  4m 37s | Max:  4m 41s | Hits: 100%/3468  
      🟩 Clang16            Pass: 100%/4   | Total: 18m 21s | Avg:  4m 35s | Max:  4m 48s | Hits: 100%/3468  
      🟨 Clang17            Pass:  88%/26  | Total:  5h 37m | Avg: 12m 58s | Max: 28m 26s | Hits: 100%/19643 
      🟩 GCC6               Pass: 100%/2   | Total:  6m 55s | Avg:  3m 27s | Max:  3m 29s | Hits:  99%/1582  
      🟩 GCC7               Pass: 100%/6   | Total: 23m 45s | Avg:  3m 57s | Max:  4m 30s | Hits:  99%/4983  
      🟩 GCC8               Pass: 100%/6   | Total: 24m 40s | Avg:  4m 06s | Max:  4m 30s | Hits:  99%/4983  
      🟩 GCC9               Pass: 100%/6   | Total: 23m 31s | Avg:  3m 55s | Max:  4m 15s | Hits:  99%/4983  
      🟩 GCC10              Pass: 100%/4   | Total: 17m 17s | Avg:  4m 19s | Max:  4m 36s | Hits:  99%/3476  
      🟩 GCC11              Pass: 100%/7   | Total: 31m 57s | Avg:  4m 33s | Max:  4m 50s | Hits:  99%/6069  
      🟩 GCC12              Pass: 100%/4   | Total: 18m 57s | Avg:  4m 44s | Max:  4m 58s | Hits:  99%/3468  
      🟨 GCC13              Pass:  78%/28  | Total:  5h 05m | Avg: 10m 54s | Max: 26m 38s | Hits:  99%/19074 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 41s | Avg:  5m 13s | Max:  5m 20s | Hits: 100%/2379  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 38s | Avg: 12m 38s | Max: 12m 38s | Hits:  99%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 20m 42s | Avg: 10m 21s | Max: 10m 35s | Hits:  99%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 32m 31s | Avg: 10m 50s | Max: 11m 10s | Hits:  99%/2127  
    🟨 cxx_family
      🟨 Clang              Pass:  94%/59  | Total:  8h 09m | Avg:  8m 17s | Max: 28m 26s | Hits: 100%/48070 
      🟨 GCC                Pass:  90%/63  | Total:  7h 32m | Avg:  7m 11s | Max: 26m 38s | Hits:  99%/48618 
      🟩 Intel              Pass: 100%/3   | Total: 15m 41s | Avg:  5m 13s | Max:  5m 20s | Hits: 100%/2379  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 05m | Avg: 10m 58s | Max: 12m 38s | Hits:  99%/4254  
    🟨 jobs
      🟩 Build              Pass: 100%/99  | Total:  8h 00m | Avg:  4m 51s | Max: 12m 38s | Hits:  99%/83380 
      🟨 DeviceLaunch       Pass:  75%/8   | Total:  2h 19m | Avg: 17m 26s | Max: 22m 47s | Hits:  99%/5202  
      🟨 GraphCapture       Pass:  75%/8   | Total:  1h 59m | Avg: 14m 56s | Max: 19m 11s | Hits:  99%/5202  
      🟨 HostLaunch         Pass:  75%/8   | Total:  2h 12m | Avg: 16m 31s | Max: 26m 36s | Hits:  99%/5202  
      🟨 TestGPU            Pass:  62%/8   | Total:  2h 31m | Avg: 18m 59s | Max: 28m 26s | Hits:  99%/4335  
    🟨 std
      🟨 11                 Pass:  94%/34  | Total:  4h 07m | Avg:  7m 16s | Max: 28m 26s | Hits:  99%/27313 
      🟨 14                 Pass:  89%/37  | Total:  4h 15m | Avg:  6m 53s | Max: 22m 36s | Hits:  99%/27706 
      🟨 17                 Pass:  91%/36  | Total:  4h 34m | Avg:  7m 38s | Max: 26m 36s | Hits:  99%/27791 
      🟩 20                 Pass: 100%/24  | Total:  4h 06m | Avg: 10m 15s | Max: 26m 38s | Hits:  99%/20511 
    🟨 gpu
      🟨 v100               Pass:  93%/131 | Total: 17h 03m | Avg:  7m 48s | Max: 28m 26s | Hits:  99%/103321
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 50s | Hits:  99%/2601  
      🟩 90a                Pass: 100%/4   | Total: 14m 33s | Avg:  3m 38s | Max:  3m 47s | Hits:  99%/3468  
    
  • 🟨 thrust: Pass: 98%/118 | Total: 14h 44m | Avg: 7m 29s | Max: 21m 50s | Hits: 83%/136556

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  98%/110 | Total: 13h 54m | Avg:  7m 35s | Max: 21m 50s | Hits:  83%/127136
      🟩 arm64              Pass: 100%/8   | Total: 49m 40s | Avg:  6m 12s | Max:  6m 42s | Hits:  80%/9420  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 51m | Avg:  7m 25s | Max: 21m 50s | Hits:  76%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 17m 32s | Avg:  5m 50s | Max:  6m 13s | Hits:  80%/3534  
      🔍 12.5               Pass:  98%/100 | Total: 12h 35m | Avg:  7m 33s | Max: 19m 27s | Hits:  84%/115362
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 12m 34s | Avg:  6m 17s | Max:  6m 20s | Hits:  80%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 51m | Avg:  7m 25s | Max: 21m 50s | Hits:  76%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 17m 32s | Avg:  5m 50s | Max:  6m 13s | Hits:  80%/3534  
      🔍 nvcc12.5           Pass:  97%/98  | Total: 12h 22m | Avg:  7m 34s | Max: 19m 27s | Hits:  84%/113008
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 12m 34s | Avg:  6m 17s | Max:  6m 20s | Hits:  80%/2354  
      🔍 nvcc               Pass:  98%/116 | Total: 14h 31m | Avg:  7m 30s | Max: 21m 50s | Hits:  83%/134202
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total: 35m 41s | Avg:  5m 56s | Max:  6m 21s | Hits:  80%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 20m 09s | Avg:  6m 43s | Max:  7m 07s | Hits:  80%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 24m 10s | Avg:  6m 02s | Max:  6m 17s | Hits:  80%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 24m 30s | Avg:  6m 07s | Max:  6m 33s | Hits:  80%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 24m 13s | Avg:  6m 03s | Max:  6m 19s | Hits:  80%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 25m 05s | Avg:  6m 16s | Max:  6m 25s | Hits:  80%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 24m 27s | Avg:  6m 06s | Max:  6m 19s | Hits:  80%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 25m 45s | Avg:  6m 26s | Max:  6m 35s | Hits:  80%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  2h 16m | Avg:  7m 36s | Max: 14m 41s | Hits:  89%/21186 
      🟩 GCC6               Pass: 100%/2   | Total: 10m 50s | Avg:  5m 25s | Max:  5m 39s | Hits:  80%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 47m 53s | Avg:  7m 58s | Max: 21m 50s | Hits:  70%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 33m 44s | Avg:  5m 37s | Max:  6m 14s | Hits:  80%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 34m 07s | Avg:  5m 41s | Max:  6m 09s | Hits:  80%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 24m 14s | Avg:  6m 03s | Max:  6m 30s | Hits:  80%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 35m 16s | Avg:  5m 02s | Max:  6m 22s | Hits:  88%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 25m 13s | Avg:  6m 18s | Max:  6m 46s | Hits:  80%/4712  
      🔍 GCC13              Pass:  90%/20  | Total:  2h 23m | Avg:  7m 11s | Max: 15m 28s | Hits:  89%/21204 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 27m 52s | Avg:  9m 17s | Max: 10m 01s | Hits:  80%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 18m 48s | Avg: 18m 48s | Max: 18m 48s | Hits:  79%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 32m 43s | Avg: 16m 21s | Max: 16m 34s | Hits:  79%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 48m | Avg: 18m 04s | Max: 19m 27s | Hits:  89%/7038  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/51  | Total:  5h 40m | Avg:  6m 41s | Max: 14m 41s | Hits:  83%/60027 
      🔍 GCC                Pass:  96%/55  | Total:  5h 55m | Avg:  6m 27s | Max: 21m 50s | Hits:  83%/62432 
      🟩 Intel              Pass: 100%/3   | Total: 27m 52s | Avg:  9m 17s | Max: 10m 01s | Hits:  80%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 39m | Avg: 17m 46s | Max: 19m 27s | Hits:  85%/10557 
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/99  | Total: 11h 14m | Avg:  6m 48s | Max: 21m 50s | Hits:  81%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 44m | Avg:  9m 32s | Max: 19m 27s | Hits:  99%/12939 
      🔍 TestGPU            Pass:  75%/8   | Total:  1h 44m | Avg: 13m 04s | Max: 15m 28s | Hits:  93%/7064  
    🟨 std
      🟩 11                 Pass: 100%/30  | Total:  3h 24m | Avg:  6m 49s | Max: 21m 50s | Hits:  81%/35328 
      🟩 14                 Pass: 100%/34  | Total:  4h 24m | Avg:  7m 47s | Max: 18m 48s | Hits:  83%/40020 
      🟨 17                 Pass:  96%/33  | Total:  4h 09m | Avg:  7m 33s | Max: 19m 27s | Hits:  84%/37669 
      🟨 20                 Pass:  95%/21  | Total:  2h 45m | Avg:  7m 51s | Max: 17m 54s | Hits:  85%/23539 
    🟨 gpu
      🟨 v100               Pass:  98%/118 | Total: 14h 44m | Avg:  7m 29s | Max: 21m 50s | Hits:  83%/136556
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 17m 32s | Avg:  5m 50s | Max:  6m 13s | Hits:  80%/3534  
      🟩 90a                Pass: 100%/4   | Total: 23m 29s | Avg:  5m 52s | Max:  6m 02s | Hits:  80%/4712  
    
  • 🟥 pycuda: Pass: 0%/1 | Total: 4m 51s | Avg: 4m 51s | Max: 4m 51s

    🟥 cpu
      🟥 amd64              Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    🟥 ctk
      🟥 12.5               Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    🟥 cudacxx
      🟥 nvcc12.5           Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    🟥 cxx
      🟥 GCC13              Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    🟥 cxx_family
      🟥 GCC                Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    🟥 gpu
      🟥 v100               Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    🟥 jobs
      🟥 Test               Pass:   0%/1   | Total:  4m 51s | Avg:  4m 51s | Max:  4m 51s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber bernhardmgruber changed the title Use accumulator_t's logic for partial sums in TBB reduce_by_key Avoid ::result_type for partial sums in TBB reduce_by_key Jul 26, 2024
Contributor

🟩 CI finished in 19h 19m: Pass: 100%/250 | Total: 1d 10h | Avg: 8m 11s | Max: 28m 26s | Hits: 90%/250036
  • 🟩 cub: Pass: 100%/131 | Total: 18h 58m | Avg: 8m 41s | Max: 28m 26s | Hits: 99%/111124

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total: 18h 17m | Avg:  8m 55s | Max: 28m 26s | Hits:  99%/104188
      🟩 arm64              Pass: 100%/8   | Total: 40m 27s | Avg:  5m 03s | Max:  5m 32s | Hits:  99%/6936  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 04m | Avg:  4m 17s | Max: 12m 38s | Hits:  99%/11792 
      🟩 11.8               Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 50s | Hits:  99%/2601  
      🟩 12.5               Pass: 100%/113 | Total: 17h 40m | Avg:  9m 22s | Max: 28m 26s | Hits:  99%/96731 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  3m 36s | Hits: 100%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 04m | Avg:  4m 17s | Max: 12m 38s | Hits:  99%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 50s | Hits:  99%/2601  
      🟩 nvcc12.5           Pass: 100%/111 | Total: 17h 33m | Avg:  9m 29s | Max: 28m 26s | Hits:  99%/95295 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  3m 36s | Hits: 100%/1436  
      🟩 nvcc               Pass: 100%/129 | Total: 18h 51m | Avg:  8m 46s | Max: 28m 26s | Hits:  99%/109688
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 27m 08s | Avg:  4m 31s | Max:  5m 25s | Hits: 100%/4980  
      🟩 Clang10            Pass: 100%/3   | Total: 15m 50s | Avg:  5m 16s | Max:  5m 45s | Hits: 100%/2607  
      🟩 Clang11            Pass: 100%/4   | Total: 18m 04s | Avg:  4m 31s | Max:  4m 55s | Hits: 100%/3476  
      🟩 Clang12            Pass: 100%/4   | Total: 18m 35s | Avg:  4m 38s | Max:  4m 52s | Hits: 100%/3476  
      🟩 Clang13            Pass: 100%/4   | Total: 17m 30s | Avg:  4m 22s | Max:  4m 34s | Hits: 100%/3476  
      🟩 Clang14            Pass: 100%/4   | Total: 17m 47s | Avg:  4m 26s | Max:  4m 43s | Hits: 100%/3476  
      🟩 Clang15            Pass: 100%/4   | Total: 18m 28s | Avg:  4m 37s | Max:  4m 41s | Hits: 100%/3468  
      🟩 Clang16            Pass: 100%/4   | Total: 18m 21s | Avg:  4m 35s | Max:  4m 48s | Hits: 100%/3468  
      🟩 Clang17            Pass: 100%/26  | Total:  6h 17m | Avg: 14m 31s | Max: 28m 26s | Hits: 100%/22244 
      🟩 GCC6               Pass: 100%/2   | Total:  6m 55s | Avg:  3m 27s | Max:  3m 29s | Hits:  99%/1582  
      🟩 GCC7               Pass: 100%/6   | Total: 23m 45s | Avg:  3m 57s | Max:  4m 30s | Hits:  99%/4983  
      🟩 GCC8               Pass: 100%/6   | Total: 24m 40s | Avg:  4m 06s | Max:  4m 30s | Hits:  99%/4983  
      🟩 GCC9               Pass: 100%/6   | Total: 23m 31s | Avg:  3m 55s | Max:  4m 15s | Hits:  99%/4983  
      🟩 GCC10              Pass: 100%/4   | Total: 17m 17s | Avg:  4m 19s | Max:  4m 36s | Hits:  99%/3476  
      🟩 GCC11              Pass: 100%/7   | Total: 31m 57s | Avg:  4m 33s | Max:  4m 50s | Hits:  99%/6069  
      🟩 GCC12              Pass: 100%/4   | Total: 18m 57s | Avg:  4m 44s | Max:  4m 58s | Hits:  99%/3468  
      🟩 GCC13              Pass: 100%/28  | Total:  6h 20m | Avg: 13m 35s | Max: 26m 38s | Hits:  99%/24276 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 41s | Avg:  5m 13s | Max:  5m 20s | Hits: 100%/2379  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 38s | Avg: 12m 38s | Max: 12m 38s | Hits:  99%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 20m 42s | Avg: 10m 21s | Max: 10m 35s | Hits:  99%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 32m 31s | Avg: 10m 50s | Max: 11m 10s | Hits:  99%/2127  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  8h 49m | Avg:  8m 58s | Max: 28m 26s | Hits: 100%/50671 
      🟩 GCC                Pass: 100%/63  | Total:  8h 47m | Avg:  8m 22s | Max: 26m 38s | Hits:  99%/53820 
      🟩 Intel              Pass: 100%/3   | Total: 15m 41s | Avg:  5m 13s | Max:  5m 20s | Hits: 100%/2379  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 05m | Avg: 10m 58s | Max: 12m 38s | Hits:  99%/4254  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total: 18h 58m | Avg:  8m 41s | Max: 28m 26s | Hits:  99%/111124
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  8h 00m | Avg:  4m 51s | Max: 12m 38s | Hits:  99%/83380 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 41m | Avg: 20m 09s | Max: 22m 47s | Hits:  99%/6936  
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 15m | Avg: 16m 53s | Max: 19m 11s | Hits:  99%/6936  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 42m | Avg: 20m 16s | Max: 26m 36s | Hits:  99%/6936  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 19m | Avg: 24m 56s | Max: 28m 26s | Hits:  99%/6936  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 50s | Hits:  99%/2601  
      🟩 90a                Pass: 100%/4   | Total: 14m 33s | Avg:  3m 38s | Max:  3m 47s | Hits:  99%/3468  
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  4h 30m | Avg:  7m 57s | Max: 28m 26s | Hits:  99%/29047 
      🟩 14                 Pass: 100%/37  | Total:  5h 03m | Avg:  8m 12s | Max: 25m 11s | Hits:  99%/31174 
      🟩 17                 Pass: 100%/36  | Total:  5h 18m | Avg:  8m 50s | Max: 26m 36s | Hits:  99%/30392 
      🟩 20                 Pass: 100%/24  | Total:  4h 06m | Avg: 10m 15s | Max: 26m 38s | Hits:  99%/20511 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 14h 55m | Avg: 7m 35s | Max: 21m 50s | Hits: 83%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 14h 05m | Avg:  7m 41s | Max: 21m 50s | Hits:  84%/129492
      🟩 arm64              Pass: 100%/8   | Total: 49m 40s | Avg:  6m 12s | Max:  6m 42s | Hits:  80%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 51m | Avg:  7m 25s | Max: 21m 50s | Hits:  76%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 17m 32s | Avg:  5m 50s | Max:  6m 13s | Hits:  80%/3534  
      🟩 12.5               Pass: 100%/100 | Total: 12h 46m | Avg:  7m 40s | Max: 19m 45s | Hits:  85%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 12m 34s | Avg:  6m 17s | Max:  6m 20s | Hits:  80%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 51m | Avg:  7m 25s | Max: 21m 50s | Hits:  76%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 17m 32s | Avg:  5m 50s | Max:  6m 13s | Hits:  80%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total: 12h 34m | Avg:  7m 41s | Max: 19m 45s | Hits:  85%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 12m 34s | Avg:  6m 17s | Max:  6m 20s | Hits:  80%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 14h 43m | Avg:  7m 36s | Max: 21m 50s | Hits:  83%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 35m 41s | Avg:  5m 56s | Max:  6m 21s | Hits:  80%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 20m 09s | Avg:  6m 43s | Max:  7m 07s | Hits:  80%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 24m 10s | Avg:  6m 02s | Max:  6m 17s | Hits:  80%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 24m 30s | Avg:  6m 07s | Max:  6m 33s | Hits:  80%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 24m 13s | Avg:  6m 03s | Max:  6m 19s | Hits:  80%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 25m 05s | Avg:  6m 16s | Max:  6m 25s | Hits:  80%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 24m 27s | Avg:  6m 06s | Max:  6m 19s | Hits:  80%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 25m 45s | Avg:  6m 26s | Max:  6m 35s | Hits:  80%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  2h 16m | Avg:  7m 36s | Max: 14m 41s | Hits:  89%/21186 
      🟩 GCC6               Pass: 100%/2   | Total: 10m 50s | Avg:  5m 25s | Max:  5m 39s | Hits:  80%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 47m 53s | Avg:  7m 58s | Max: 21m 50s | Hits:  70%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 33m 44s | Avg:  5m 37s | Max:  6m 14s | Hits:  80%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 34m 07s | Avg:  5m 41s | Max:  6m 09s | Hits:  80%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 24m 14s | Avg:  6m 03s | Max:  6m 30s | Hits:  80%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 35m 16s | Avg:  5m 02s | Max:  6m 22s | Hits:  88%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 25m 13s | Avg:  6m 18s | Max:  6m 46s | Hits:  80%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  2h 35m | Avg:  7m 46s | Max: 19m 45s | Hits:  90%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 27m 52s | Avg:  9m 17s | Max: 10m 01s | Hits:  80%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 18m 48s | Avg: 18m 48s | Max: 18m 48s | Hits:  79%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 32m 43s | Avg: 16m 21s | Max: 16m 34s | Hits:  79%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 48m | Avg: 18m 04s | Max: 19m 27s | Hits:  89%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  5h 40m | Avg:  6m 41s | Max: 14m 41s | Hits:  83%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  6h 06m | Avg:  6m 40s | Max: 21m 50s | Hits:  83%/64788 
      🟩 Intel              Pass: 100%/3   | Total: 27m 52s | Avg:  9m 17s | Max: 10m 01s | Hits:  80%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 39m | Avg: 17m 46s | Max: 19m 27s | Hits:  85%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 14h 55m | Avg:  7m 35s | Max: 21m 50s | Hits:  83%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total: 11h 14m | Avg:  6m 48s | Max: 21m 50s | Hits:  81%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 44m | Avg:  9m 32s | Max: 19m 27s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 56m | Avg: 14m 31s | Max: 19m 45s | Hits:  95%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 17m 32s | Avg:  5m 50s | Max:  6m 13s | Hits:  80%/3534  
      🟩 90a                Pass: 100%/4   | Total: 23m 29s | Avg:  5m 52s | Max:  6m 02s | Hits:  80%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  3h 24m | Avg:  6m 49s | Max: 21m 50s | Hits:  81%/35328 
      🟩 14                 Pass: 100%/34  | Total:  4h 24m | Avg:  7m 47s | Max: 18m 48s | Hits:  83%/40020 
      🟩 17                 Pass: 100%/33  | Total:  4h 11m | Avg:  7m 36s | Max: 19m 27s | Hits:  84%/38847 
      🟩 20                 Pass: 100%/21  | Total:  2h 54m | Avg:  8m 19s | Max: 19m 45s | Hits:  86%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 12m 15s | Avg: 12m 15s | Max: 12m 15s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber bernhardmgruber merged commit b761538 into NVIDIA:main Jul 26, 2024
263 of 265 checks passed
@bernhardmgruber bernhardmgruber deleted the partial_sum_type branch July 26, 2024 14:55
pciolkosz pushed a commit to pciolkosz/cccl that referenced this pull request Aug 4, 2024
)

This allows us to get rid of partial_sum_type, which still uses the C++11-deprecated function object API ::result_type.

Co-authored-by: Georgii Evtushenko <[email protected]>
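As a rough illustration of the behavior change described in this PR, the sketch below contrasts picking the accumulator type from a functor's nested `::result_type` with picking it from the input iterator's `value_type`. This is not Thrust's implementation: `wide_plus`, `legacy_partial_sum_t`, `value_based_partial_sum_t`, `reduce_as`, and `run_example` are hypothetical names invented for the example, and unsigned types are used instead of the `int8`/`int64` mentioned in the description so that the narrowing is well-defined modular arithmetic.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical functor that advertises a wide nested ::result_type (legacy style).
struct wide_plus
{
  using result_type = std::uint64_t;
  std::uint64_t operator()(std::uint64_t a, std::uint64_t b) const { return a + b; }
};

// Legacy-style choice: trust the functor's nested ::result_type.
template <class BinaryOp>
using legacy_partial_sum_t = typename BinaryOp::result_type;

// Iterator-based choice: accumulate in the input iterator's value_type.
template <class InputIt>
using value_based_partial_sum_t = typename std::iterator_traits<InputIt>::value_type;

// Sequential reduction that stores every partial sum in Acc.
// Precondition (assumed for brevity): the range is non-empty.
template <class Acc, class InputIt, class BinaryOp>
Acc reduce_as(InputIt first, InputIt last, BinaryOp op)
{
  Acc acc = static_cast<Acc>(*first);
  for (++first; first != last; ++first)
  {
    acc = static_cast<Acc>(op(acc, *first)); // narrowing happens here if Acc is small
  }
  return acc;
}

using input_it = std::vector<std::uint8_t>::const_iterator;
static_assert(std::is_same<legacy_partial_sum_t<wide_plus>, std::uint64_t>::value, "");
static_assert(std::is_same<value_based_partial_sum_t<input_it>, std::uint8_t>::value, "");

inline void run_example()
{
  const std::vector<std::uint8_t> values(200, 2); // true sum is 400

  const auto wide =
    reduce_as<legacy_partial_sum_t<wide_plus>>(values.begin(), values.end(), wide_plus{});
  const auto narrow =
    reduce_as<value_based_partial_sum_t<input_it>>(values.begin(), values.end(), wide_plus{});

  assert(wide == 400u);  // accumulated in uint64_t
  assert(narrow == 144); // accumulated in uint8_t: 400 mod 256
}
```

Code that relied on a wide `::result_type` to avoid overflow could, under these assumptions, keep the wide accumulation by feeding the algorithm inputs whose `value_type` is already wide, for example through a converting iterator.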
Labels
thrust For all items related to Thrust.
Projects
Status: Done

4 participants