Guard against an overflow in sort tests #1980

Merged
merged 1 commit into from
Jul 16, 2024

Conversation

bernhardmgruber
Contributor

Fixes: #1979

@bernhardmgruber bernhardmgruber added the bug and cub labels Jul 11, 2024
Contributor

🟨 CI finished in 2h 24m: Pass: 98%/249 | Total: 1d 18h | Avg: 10m 12s | Max: 42m 26s | Hits: 90%/244804
  • 🟨 cub: Pass: 96%/131 | Total: 1d 02h | Avg: 11m 55s | Max: 41m 48s | Hits: 89%/105892

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  96%/123 | Total: 23h 09m | Avg: 11m 17s | Max: 41m 48s | Hits:  91%/99068 
      🟩 arm64              Pass: 100%/8   | Total:  2h 52m | Avg: 21m 31s | Max: 39m 52s | Hits:  57%/6824  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 17m | Avg:  5m 09s | Max: 15m 21s | Hits:  99%/11583 
      🟩 11.8               Pass: 100%/3   | Total: 18m 15s | Avg:  6m 05s | Max:  6m 11s | Hits:  99%/2559  
      🔍 12.5               Pass:  96%/113 | Total:  1d 00h | Avg: 12m 58s | Max: 41m 48s | Hits:  87%/91750 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  9m 09s | Avg:  4m 34s | Max:  4m 38s | Hits:  99%/1410  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 17m | Avg:  5m 09s | Max: 15m 21s | Hits:  99%/11583 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 18m 15s | Avg:  6m 05s | Max:  6m 11s | Hits:  99%/2559  
      🔍 nvcc12.5           Pass:  96%/111 | Total:  1d 00h | Avg: 13m 07s | Max: 41m 48s | Hits:  87%/90340 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 09s | Avg:  4m 34s | Max:  4m 38s | Hits:  99%/1410  
      🔍 nvcc               Pass:  96%/129 | Total:  1d 01h | Avg: 12m 01s | Max: 41m 48s | Hits:  89%/104482
    🟨 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 32m 02s | Avg:  5m 20s | Max:  6m 00s | Hits:  99%/4896  
      🟩 Clang10            Pass: 100%/3   | Total: 18m 28s | Avg:  6m 09s | Max:  6m 22s | Hits:  99%/2565  
      🟩 Clang11            Pass: 100%/4   | Total: 21m 35s | Avg:  5m 23s | Max:  5m 35s | Hits:  99%/3420  
      🟩 Clang12            Pass: 100%/4   | Total: 21m 59s | Avg:  5m 29s | Max:  5m 38s | Hits:  99%/3420  
      🟩 Clang13            Pass: 100%/4   | Total: 21m 57s | Avg:  5m 29s | Max:  5m 32s | Hits:  99%/3420  
      🟩 Clang14            Pass: 100%/4   | Total: 21m 22s | Avg:  5m 20s | Max:  5m 35s | Hits:  99%/3420  
      🟩 Clang15            Pass: 100%/4   | Total: 21m 02s | Avg:  5m 15s | Max:  5m 27s | Hits:  99%/3412  
      🟩 Clang16            Pass: 100%/4   | Total: 21m 17s | Avg:  5m 19s | Max:  5m 33s | Hits:  99%/3412  
      🟨 Clang17            Pass:  92%/26  | Total:  5h 39m | Avg: 13m 02s | Max: 31m 02s | Hits:  99%/20176 
      🟩 GCC6               Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  4m 14s | Hits:  99%/1554  
      🟩 GCC7               Pass: 100%/6   | Total: 29m 43s | Avg:  4m 57s | Max:  5m 34s | Hits:  99%/4899  
      🟩 GCC8               Pass: 100%/6   | Total: 28m 18s | Avg:  4m 43s | Max:  5m 28s | Hits:  99%/4899  
      🟩 GCC9               Pass: 100%/6   | Total: 29m 08s | Avg:  4m 51s | Max:  5m 24s | Hits:  99%/4899  
      🟩 GCC10              Pass: 100%/4   | Total: 21m 31s | Avg:  5m 22s | Max:  5m 39s | Hits:  99%/3420  
      🟩 GCC11              Pass: 100%/7   | Total: 40m 29s | Avg:  5m 47s | Max:  6m 11s | Hits:  99%/5971  
      🟩 GCC12              Pass: 100%/4   | Total: 22m 53s | Avg:  5m 43s | Max:  5m 57s | Hits:  99%/3412  
      🟨 GCC13              Pass:  92%/28  | Total: 10h 54m | Avg: 23m 22s | Max: 39m 52s | Hits:  60%/22178 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 59m | Avg: 39m 57s | Max: 41m 48s | Hits:   3%/2343  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 15m 21s | Avg: 15m 21s | Max: 15m 21s | Hits:  98%/696   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 29m 01s | Avg: 14m 30s | Max: 14m 32s | Hits:  98%/1392  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 43m 23s | Avg: 14m 27s | Max: 14m 53s | Hits:  98%/2088  
    🟨 cxx_family
      🟨 Clang              Pass:  96%/59  | Total:  8h 38m | Avg:  8m 47s | Max: 31m 02s | Hits:  99%/48141 
      🟨 GCC                Pass:  96%/63  | Total: 13h 54m | Avg: 13m 15s | Max: 39m 52s | Hits:  82%/51232 
      🟩 Intel              Pass: 100%/3   | Total:  1h 59m | Avg: 39m 57s | Max: 41m 48s | Hits:   3%/2343  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 27m | Avg: 14m 37s | Max: 15m 21s | Hits:  98%/4176  
    🟨 jobs
      🟩 Build              Pass: 100%/99  | Total: 16h 35m | Avg: 10m 03s | Max: 41m 48s | Hits:  86%/82008 
      🟨 DeviceLaunch       Pass:  75%/8   | Total:  2h 14m | Avg: 16m 46s | Max: 33m 07s | Hits:  99%/5118  
      🟩 GraphCapture       Pass: 100%/8   | Total:  1h 57m | Avg: 14m 42s | Max: 17m 08s | Hits:  99%/6824  
      🟨 HostLaunch         Pass:  87%/8   | Total:  2h 06m | Avg: 15m 46s | Max: 19m 55s | Hits:  99%/5971  
      🟨 TestGPU            Pass:  87%/8   | Total:  3h 08m | Avg: 23m 31s | Max: 31m 02s | Hits:  99%/5971  
    🟨 std
      🟨 11                 Pass:  94%/34  | Total:  6h 02m | Avg: 10m 40s | Max: 41m 48s | Hits:  88%/26867 
      🟨 14                 Pass:  97%/37  | Total:  6h 55m | Avg: 11m 13s | Max: 39m 36s | Hits:  89%/29808 
      🟩 17                 Pass: 100%/36  | Total:  7h 35m | Avg: 12m 39s | Max: 39m 52s | Hits:  89%/29893 
      🟨 20                 Pass:  95%/24  | Total:  5h 27m | Avg: 13m 38s | Max: 37m 15s | Hits:  90%/19324 
    🟨 gpu
      🟨 v100               Pass:  96%/131 | Total:  1d 02h | Avg: 11m 55s | Max: 41m 48s | Hits:  89%/105892
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 18m 15s | Avg:  6m 05s | Max:  6m 11s | Hits:  99%/2559  
      🟩 90a                Pass: 100%/4   | Total:  1h 22m | Avg: 20m 33s | Max: 22m 26s | Hits:  16%/3412  
    
  • 🟩 thrust: Pass: 100%/118 | Total: 16h 19m | Avg: 8m 18s | Max: 42m 26s | Hits: 92%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 13h 59m | Avg:  7m 37s | Max: 42m 26s | Hits:  94%/129492
      🟩 arm64              Pass: 100%/8   | Total:  2h 20m | Avg: 17m 33s | Max: 32m 58s | Hits:  54%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 27m | Avg:  5m 51s | Max: 22m 38s | Hits:  94%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 48s | Hits:  99%/3534  
      🟩 12.5               Pass: 100%/100 | Total: 14h 37m | Avg:  8m 46s | Max: 42m 26s | Hits:  91%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  9m 08s | Avg:  4m 34s | Max:  4m 51s | Hits:  99%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 27m | Avg:  5m 51s | Max: 22m 38s | Hits:  94%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 48s | Hits:  99%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total: 14h 28m | Avg:  8m 51s | Max: 42m 26s | Hits:  91%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 08s | Avg:  4m 34s | Max:  4m 51s | Hits:  99%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 16h 10m | Avg:  8m 21s | Max: 42m 26s | Hits:  91%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 26m 51s | Avg:  4m 28s | Max:  5m 00s | Hits:  99%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 14m 57s | Avg:  4m 59s | Max:  5m 12s | Hits:  99%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 16m 43s | Avg:  4m 10s | Max:  4m 21s | Hits:  99%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 17m 14s | Avg:  4m 18s | Max:  4m 29s | Hits:  99%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 17m 26s | Avg:  4m 21s | Max:  4m 27s | Hits:  99%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 17m 11s | Avg:  4m 17s | Max:  4m 26s | Hits:  99%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 18m 05s | Avg:  4m 31s | Max:  4m 54s | Hits:  99%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 17m 13s | Avg:  4m 18s | Max:  4m 28s | Hits:  99%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  1h 59m | Avg:  6m 38s | Max: 13m 23s | Hits:  99%/21186 
      🟩 GCC6               Pass: 100%/2   | Total:  7m 20s | Avg:  3m 40s | Max:  3m 45s | Hits:  99%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 23m 56s | Avg:  3m 59s | Max:  4m 52s | Hits:  99%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 42m 40s | Avg:  7m 06s | Max: 22m 38s | Hits:  86%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 23m 40s | Avg:  3m 56s | Max:  4m 13s | Hits:  99%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 16m 57s | Avg:  4m 14s | Max:  4m 25s | Hits:  99%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 31m 27s | Avg:  4m 29s | Max:  4m 48s | Hits:  99%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 18m 10s | Avg:  4m 32s | Max:  4m 47s | Hits:  99%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  4h 39m | Avg: 13m 59s | Max: 32m 58s | Hits:  73%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 04m | Avg: 41m 23s | Max: 42m 26s | Hits:   2%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 16m 00s | Avg: 16m 00s | Max: 16m 00s | Hits:  98%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 29m 33s | Avg: 14m 46s | Max: 15m 11s | Hits:  98%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 40m | Avg: 16m 45s | Max: 20m 15s | Hits:  98%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  4h 25m | Avg:  5m 11s | Max: 13m 23s | Hits:  99%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  7h 24m | Avg:  8m 04s | Max: 32m 58s | Hits:  88%/64788 
      🟩 Intel              Pass: 100%/3   | Total:  2h 04m | Avg: 41m 23s | Max: 42m 26s | Hits:   2%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 26m | Avg: 16m 13s | Max: 20m 15s | Hits:  98%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 16h 19m | Avg:  8m 18s | Max: 42m 26s | Hits:  92%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total: 12h 55m | Avg:  7m 50s | Max: 42m 26s | Hits:  90%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 44m | Avg:  9m 31s | Max: 20m 15s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 39m | Avg: 12m 24s | Max: 13m 40s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 13m 44s | Avg:  4m 34s | Max:  4m 48s | Hits:  99%/3534  
      🟩 90a                Pass: 100%/4   | Total:  1h 04m | Avg: 16m 04s | Max: 18m 26s | Hits:  58%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  3h 48m | Avg:  7m 36s | Max: 39m 27s | Hits:  89%/35328 
      🟩 14                 Pass: 100%/34  | Total:  4h 49m | Avg:  8m 31s | Max: 42m 26s | Hits:  92%/40020 
      🟩 17                 Pass: 100%/33  | Total:  4h 38m | Avg:  8m 26s | Max: 42m 16s | Hits:  92%/38847 
      🟩 20                 Pass: 100%/21  | Total:  3h 02m | Avg:  8m 42s | Max: 32m 58s | Hits:  93%/24717 
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental

🏃‍ Runner counts (total jobs: 249)

# Runner
178 linux-amd64-cpu16
40 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber bernhardmgruber marked this pull request as ready for review July 11, 2024 22:14
@bernhardmgruber bernhardmgruber requested review from a team as code owners July 11, 2024 22:14
Collaborator

@gevtushenko gevtushenko left a comment

There's still an overflow on the index_to_key_value_op end. Without a corresponding change there, the subsequent comparison will not yield the right results.

Before you start fixing the overflow in index_to_key_value_op, I assume that there's a reason to have + 1 here. If omitting + 1 is possible, we should omit it for every type. If omitting + 1 is not possible, we should have it for every type.

Given that we never actually test with std::size_t, I'd suggest adding a static assert and saving some time by tabling the investigation until the moment we actually need to test against std::size_t.

@bernhardmgruber
Contributor Author

> There's still an overflow on the index_to_key_value_op end. Without a corresponding change there, the subsequent comparison will not yield the right results.
>
> Before you start fixing the overflow in index_to_key_value_op, I assume that there's a reason to have + 1 here.

The + 1 is needed to get the correct count. A uint16 has max value (2^16 - 1) - lowest value (0) + 1 = 2^16 distinct values. That works because the calculation is done in size_t. If UnsignedIntegralKeyT is itself a size_t, the calculation becomes (2^64 - 1) - 0 + 1, and the final + 1 overflows, wrapping the result to 0.
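
The arithmetic above can be checked at compile time. This is only a minimal sketch: the function `num_distinct_key_values` below is a hypothetical stand-in named after the variable in this discussion, not the code from the CUB tests, and it assumes `std::size_t` is wider than 16 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper (an assumption for illustration, not the CUB test code):
// counts the distinct values of an unsigned key type, doing the arithmetic in
// std::size_t as described in the comment above.
template <typename UnsignedIntegralKeyT>
constexpr std::size_t num_distinct_key_values()
{
  return static_cast<std::size_t>(std::numeric_limits<UnsignedIntegralKeyT>::max())
       - static_cast<std::size_t>(std::numeric_limits<UnsignedIntegralKeyT>::lowest())
       + 1; // wraps around to 0 when the key type is as wide as std::size_t
}

// Narrow key types: the + 1 yields the exact count.
static_assert(num_distinct_key_values<std::uint8_t>() == 256, "");
static_assert(num_distinct_key_values<std::uint16_t>() == 65536, "");

// Key type as wide as std::size_t: unsigned arithmetic wraps modulo 2^N,
// so max - 0 + 1 evaluates to 0 instead of 2^N.
static_assert(num_distinct_key_values<std::size_t>() == 0, "");
```

Unsigned wraparound is well defined in C++, so the last assertion compiles cleanly. That is why the problem only shows up as a wrong result rather than as a compiler diagnostic.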

> If omitting + 1 is possible, we should omit it for every type. If omitting + 1 is not possible, we should have it for every type.

Omitting the + 1 introduces a small error, but I figured that because num_total_items is also a size_t, it can never actually be larger than num_distinct_key_values, so the error never shows up. I have not fully verified this, TBH.

> Given that we never actually test with std::size_t, I'd suggest adding a static assert and saving some time by tabling the investigation until the moment we actually need to test against std::size_t.

That confused me initially, because I looked at the offset_types list and saw uint64_t. But you made me look a second time, and I noticed that the key types come from elsewhere and really never use a size_t. Thus, adding the static_assert is for sure the safest bet! Thank you for noticing this!
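
For illustration, a guard of this kind might look roughly like the sketch below. The helper name `guarded_num_distinct_key_values` and the exact conditions and messages are assumptions made for this example; the merged change may phrase its `static_assert` differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper (an assumption for illustration, not the merged code):
// refuses to compile for key types where the trailing + 1 would overflow.
template <typename UnsignedIntegralKeyT>
constexpr std::size_t guarded_num_distinct_key_values()
{
  static_assert(std::is_unsigned<UnsignedIntegralKeyT>::value,
                "key type must be an unsigned integral type");
  static_assert(sizeof(UnsignedIntegralKeyT) < sizeof(std::size_t),
                "key type must be narrower than std::size_t, otherwise + 1 overflows");
  return static_cast<std::size_t>(std::numeric_limits<UnsignedIntegralKeyT>::max())
       - static_cast<std::size_t>(std::numeric_limits<UnsignedIntegralKeyT>::lowest())
       + 1;
}

// Compiles: these key types are narrower than std::size_t on common platforms.
static_assert(guarded_num_distinct_key_values<std::uint8_t>() == 256, "");
static_assert(guarded_num_distinct_key_values<std::uint16_t>() == 65536, "");

// Would fail to compile, which is the point of the guard:
// guarded_num_distinct_key_values<std::size_t>();
```

With a guard like this, a future attempt to test with a `std::size_t` key fails at compile time with a readable message instead of silently comparing against a wrapped count.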

@bernhardmgruber bernhardmgruber changed the title Avoid overflow in sort tests Guard against an overflow in sort tests Jul 15, 2024
@bernhardmgruber bernhardmgruber enabled auto-merge (squash) July 15, 2024 23:30
Contributor

🟩 CI finished in 17h 43m: Pass: 100%/250 | Total: 1d 07h | Avg: 7m 31s | Max: 32m 10s | Hits: 99%/248216
  • 🟩 cub: Pass: 100%/131 | Total: 19h 54m | Avg: 9m 07s | Max: 32m 10s | Hits: 99%/109304

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total: 19h 05m | Avg:  9m 18s | Max: 32m 10s | Hits:  99%/102480
      🟩 arm64              Pass: 100%/8   | Total: 49m 22s | Avg:  6m 10s | Max:  6m 35s | Hits:  99%/6824  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 15m | Avg:  5m 00s | Max: 13m 30s | Hits:  99%/11583 
      🟩 11.8               Pass: 100%/3   | Total: 17m 49s | Avg:  5m 56s | Max:  6m 00s | Hits:  99%/2559  
      🟩 12.5               Pass: 100%/113 | Total: 18h 21m | Avg:  9m 45s | Max: 32m 10s | Hits:  99%/95162 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  4m 46s | Hits:  99%/1410  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 15m | Avg:  5m 00s | Max: 13m 30s | Hits:  99%/11583 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 17m 49s | Avg:  5m 56s | Max:  6m 00s | Hits:  99%/2559  
      🟩 nvcc12.5           Pass: 100%/111 | Total: 18h 12m | Avg:  9m 50s | Max: 32m 10s | Hits:  99%/93752 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  4m 46s | Hits:  99%/1410  
      🟩 nvcc               Pass: 100%/129 | Total: 19h 45m | Avg:  9m 11s | Max: 32m 10s | Hits:  99%/107894
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 31m 23s | Avg:  5m 13s | Max:  6m 07s | Hits:  99%/4896  
      🟩 Clang10            Pass: 100%/3   | Total: 18m 15s | Avg:  6m 05s | Max:  6m 12s | Hits:  99%/2565  
      🟩 Clang11            Pass: 100%/4   | Total: 20m 44s | Avg:  5m 11s | Max:  5m 25s | Hits:  99%/3420  
      🟩 Clang12            Pass: 100%/4   | Total: 21m 00s | Avg:  5m 15s | Max:  5m 27s | Hits:  99%/3420  
      🟩 Clang13            Pass: 100%/4   | Total: 21m 41s | Avg:  5m 25s | Max:  5m 40s | Hits:  99%/3420  
      🟩 Clang14            Pass: 100%/4   | Total: 21m 03s | Avg:  5m 15s | Max:  5m 30s | Hits:  99%/3420  
      🟩 Clang15            Pass: 100%/4   | Total: 21m 06s | Avg:  5m 16s | Max:  5m 24s | Hits:  99%/3412  
      🟩 Clang16            Pass: 100%/4   | Total: 21m 42s | Avg:  5m 25s | Max:  5m 42s | Hits:  99%/3412  
      🟩 Clang17            Pass: 100%/26  | Total:  5h 57m | Avg: 13m 45s | Max: 27m 52s | Hits:  99%/21882 
      🟩 GCC6               Pass: 100%/2   | Total:  8m 48s | Avg:  4m 24s | Max:  4m 39s | Hits:  99%/1554  
      🟩 GCC7               Pass: 100%/6   | Total: 55m 15s | Avg:  9m 12s | Max: 32m 10s | Hits:  92%/4899  
      🟩 GCC8               Pass: 100%/6   | Total: 29m 16s | Avg:  4m 52s | Max:  5m 38s | Hits:  99%/4899  
      🟩 GCC9               Pass: 100%/6   | Total: 29m 32s | Avg:  4m 55s | Max:  5m 33s | Hits:  99%/4899  
      🟩 GCC10              Pass: 100%/4   | Total: 20m 24s | Avg:  5m 06s | Max:  5m 13s | Hits:  99%/3420  
      🟩 GCC11              Pass: 100%/7   | Total: 38m 25s | Avg:  5m 29s | Max:  6m 00s | Hits:  99%/5971  
      🟩 GCC12              Pass: 100%/4   | Total: 21m 18s | Avg:  5m 19s | Max:  5m 31s | Hits:  99%/3412  
      🟩 GCC13              Pass: 100%/28  | Total:  6h 03m | Avg: 12m 59s | Max: 26m 09s | Hits:  99%/23884 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 18m 18s | Avg:  6m 06s | Max:  6m 10s | Hits:  99%/2343  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 13m 30s | Avg: 13m 30s | Max: 13m 30s | Hits:  98%/696   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 23m 37s | Avg: 11m 48s | Max: 11m 49s | Hits:  98%/1392  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 37m 43s | Avg: 12m 34s | Max: 12m 43s | Hits:  98%/2088  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  8h 54m | Avg:  9m 03s | Max: 27m 52s | Hits:  99%/49847 
      🟩 GCC                Pass: 100%/63  | Total:  9h 26m | Avg:  8m 59s | Max: 32m 10s | Hits:  98%/52938 
      🟩 Intel              Pass: 100%/3   | Total: 18m 18s | Avg:  6m 06s | Max:  6m 10s | Hits:  99%/2343  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 14m | Avg: 12m 28s | Max: 13m 30s | Hits:  98%/4176  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total: 19h 54m | Avg:  9m 07s | Max: 32m 10s | Hits:  99%/109304
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  9h 51m | Avg:  5m 58s | Max: 32m 10s | Hits:  98%/82008 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 25m | Avg: 18m 10s | Max: 21m 02s | Hits:  99%/6824  
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 03m | Avg: 15m 27s | Max: 17m 39s | Hits:  99%/6824  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 16m | Avg: 17m 05s | Max: 19m 22s | Hits:  99%/6824  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 17m | Avg: 24m 41s | Max: 27m 52s | Hits:  99%/6824  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 17m 49s | Avg:  5m 56s | Max:  6m 00s | Hits:  99%/2559  
      🟩 90a                Pass: 100%/4   | Total: 17m 16s | Avg:  4m 19s | Max:  4m 36s | Hits:  99%/3412  
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  5h 08m | Avg:  9m 04s | Max: 32m 10s | Hits:  98%/28573 
      🟩 14                 Pass: 100%/37  | Total:  5h 25m | Avg:  8m 47s | Max: 27m 52s | Hits:  99%/30661 
      🟩 17                 Pass: 100%/36  | Total:  5h 18m | Avg:  8m 51s | Max: 24m 05s | Hits:  99%/29893 
      🟩 20                 Pass: 100%/24  | Total:  4h 02m | Avg: 10m 05s | Max: 23m 12s | Hits:  99%/20177 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 11h 13m | Avg: 5m 42s | Max: 18m 37s | Hits: 99%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 10h 39m | Avg:  5m 49s | Max: 18m 37s | Hits:  99%/129492
      🟩 arm64              Pass: 100%/8   | Total: 33m 53s | Avg:  4m 14s | Max:  4m 34s | Hits:  99%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 05m | Avg:  4m 22s | Max: 15m 12s | Hits:  99%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 12m 23s | Avg:  4m 07s | Max:  4m 20s | Hits:  99%/3534  
      🟩 12.5               Pass: 100%/100 | Total:  9h 55m | Avg:  5m 57s | Max: 18m 37s | Hits:  99%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  8m 06s | Avg:  4m 03s | Max:  4m 09s | Hits:  99%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 05m | Avg:  4m 22s | Max: 15m 12s | Hits:  99%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 12m 23s | Avg:  4m 07s | Max:  4m 20s | Hits:  99%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  9h 47m | Avg:  5m 59s | Max: 18m 37s | Hits:  99%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 06s | Avg:  4m 03s | Max:  4m 09s | Hits:  99%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 11h 05m | Avg:  5m 44s | Max: 18m 37s | Hits:  99%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 26m 24s | Avg:  4m 24s | Max:  5m 11s | Hits:  99%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 14m 47s | Avg:  4m 55s | Max:  5m 17s | Hits:  99%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 17m 04s | Avg:  4m 16s | Max:  4m 33s | Hits:  99%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 16m 32s | Avg:  4m 08s | Max:  4m 15s | Hits:  99%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 16m 55s | Avg:  4m 13s | Max:  4m 19s | Hits:  99%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 17m 56s | Avg:  4m 29s | Max:  4m 41s | Hits:  99%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 17m 56s | Avg:  4m 29s | Max:  4m 55s | Hits:  99%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 17m 23s | Avg:  4m 20s | Max:  4m 34s | Hits:  99%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  1h 58m | Avg:  6m 36s | Max: 14m 23s | Hits:  99%/21186 
      🟩 GCC6               Pass: 100%/2   | Total:  7m 11s | Avg:  3m 35s | Max:  3m 45s | Hits:  99%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 22m 57s | Avg:  3m 49s | Max:  4m 38s | Hits:  99%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 22m 48s | Avg:  3m 48s | Max:  4m 19s | Hits:  99%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 23m 10s | Avg:  3m 51s | Max:  4m 32s | Hits:  99%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 17m 21s | Avg:  4m 20s | Max:  4m 25s | Hits:  99%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 29m 54s | Avg:  4m 16s | Max:  4m 43s | Hits:  99%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 17m 58s | Avg:  4m 29s | Max:  4m 47s | Hits:  99%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  2h 03m | Avg:  6m 11s | Max: 13m 01s | Hits:  98%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 10s | Avg:  5m 03s | Max:  5m 25s | Hits:  99%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 15m 12s | Avg: 15m 12s | Max: 15m 12s | Hits:  98%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 24m 03s | Avg: 12m 01s | Max: 12m 24s | Hits:  98%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 30m | Avg: 15m 05s | Max: 18m 37s | Hits:  98%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  4h 23m | Avg:  5m 10s | Max: 14m 23s | Hits:  99%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  4h 25m | Avg:  4m 49s | Max: 13m 01s | Hits:  99%/64788 
      🟩 Intel              Pass: 100%/3   | Total: 15m 10s | Avg:  5m 03s | Max:  5m 25s | Hits:  99%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 09m | Avg: 14m 25s | Max: 18m 37s | Hits:  98%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 11h 13m | Avg:  5m 42s | Max: 18m 37s | Hits:  99%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 52m | Avg:  4m 46s | Max: 15m 12s | Hits:  99%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 43m | Avg:  9m 22s | Max: 18m 37s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 38m | Avg: 12m 16s | Max: 14m 23s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 12m 23s | Avg:  4m 07s | Max:  4m 20s | Hits:  99%/3534  
      🟩 90a                Pass: 100%/4   | Total: 14m 04s | Avg:  3m 31s | Max:  3m 43s | Hits:  99%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 24m | Avg:  4m 48s | Max: 14m 23s | Hits:  99%/35328 
      🟩 14                 Pass: 100%/34  | Total:  3h 23m | Avg:  5m 59s | Max: 16m 48s | Hits:  99%/40020 
      🟩 17                 Pass: 100%/33  | Total:  3h 13m | Avg:  5m 52s | Max: 18m 37s | Hits:  99%/38847 
      🟩 20                 Pass: 100%/21  | Total:  2h 11m | Avg:  6m 16s | Max: 17m 54s | Hits:  99%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber bernhardmgruber merged commit 2a3fea9 into NVIDIA:main Jul 16, 2024
264 checks passed
@bernhardmgruber bernhardmgruber deleted the overflow_sort branch July 16, 2024 17:14
pciolkosz pushed a commit to pciolkosz/cccl that referenced this pull request Jul 17, 2024
Labels
bug, cub
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

CUB device merge sort tests overflow
2 participants