[CUDAX] Add experimental owning abstraction for cudaStream_t #2093

Merged
merged 11 commits into from
Jul 30, 2024

pciolkosz
Contributor

This pull request adds an owning type cudax::stream for cudaStream_t.

Some functions in cudax::stream, such as record and wait, should go to cuda::stream_ref, but libcu++ can't depend on cudax. We could consider having two versions of stream_ref: one in cuda:: and a second one in cudax::.

@pciolkosz pciolkosz requested review from a team as code owners July 28, 2024 19:47
Contributor

🟨 CI finished in 1h 44m: Pass: 99%/417 | Total: 2d 02h | Avg: 7m 16s | Max: 1h 19m | Hits: 97%/524552
  • 🟨 cub: Pass: 99%/131 | Total: 1d 00h | Avg: 11m 25s | Max: 1h 19m | Hits: 99%/110257

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/123 | Total: 23h 29m | Avg: 11m 27s | Max:  1h 19m | Hits:  99%/103321
      🟩 arm64              Pass: 100%/8   | Total:  1h 27m | Avg: 10m 53s | Max: 54m 04s | Hits:  94%/6936  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 02m | Avg:  4m 09s | Max: 12m 38s | Hits:  99%/11792 
      🟩 11.8               Pass: 100%/3   | Total: 13m 16s | Avg:  4m 25s | Max:  4m 46s | Hits:  99%/2601  
      🔍 12.5               Pass:  99%/113 | Total: 23h 41m | Avg: 12m 34s | Max:  1h 19m | Hits:  99%/95864 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 15s | Avg:  3m 37s | Max:  3m 43s | Hits: 100%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 02m | Avg:  4m 09s | Max: 12m 38s | Hits:  99%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 13m 16s | Avg:  4m 25s | Max:  4m 46s | Hits:  99%/2601  
      🔍 nvcc12.5           Pass:  99%/111 | Total: 23h 33m | Avg: 12m 44s | Max:  1h 19m | Hits:  99%/94428 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 15s | Avg:  3m 37s | Max:  3m 43s | Hits: 100%/1436  
      🔍 nvcc               Pass:  99%/129 | Total:  1d 00h | Avg: 11m 32s | Max:  1h 19m | Hits:  99%/108821
    🔍 cxx: Clang17 🔍
      🟩 Clang9             Pass: 100%/6   | Total: 26m 38s | Avg:  4m 26s | Max:  5m 26s | Hits: 100%/4980  
      🟩 Clang10            Pass: 100%/3   | Total: 15m 31s | Avg:  5m 10s | Max:  5m 26s | Hits: 100%/2607  
      🟩 Clang11            Pass: 100%/4   | Total: 16m 58s | Avg:  4m 14s | Max:  4m 29s | Hits: 100%/3476  
      🟩 Clang12            Pass: 100%/4   | Total: 17m 49s | Avg:  4m 27s | Max:  4m 51s | Hits: 100%/3476  
      🟩 Clang13            Pass: 100%/4   | Total: 17m 33s | Avg:  4m 23s | Max:  4m 34s | Hits: 100%/3476  
      🟩 Clang14            Pass: 100%/4   | Total: 17m 47s | Avg:  4m 26s | Max:  4m 43s | Hits: 100%/3476  
      🟩 Clang15            Pass: 100%/4   | Total: 17m 59s | Avg:  4m 29s | Max:  4m 32s | Hits: 100%/3468  
      🟩 Clang16            Pass: 100%/4   | Total: 18m 17s | Avg:  4m 34s | Max:  4m 50s | Hits: 100%/3468  
      🔍 Clang17            Pass:  96%/26  | Total:  8h 54m | Avg: 20m 34s | Max: 51m 37s | Hits:  99%/21377 
      🟩 GCC6               Pass: 100%/2   | Total:  6m 57s | Avg:  3m 28s | Max:  3m 29s | Hits:  99%/1582  
      🟩 GCC7               Pass: 100%/6   | Total: 23m 01s | Avg:  3m 50s | Max:  4m 33s | Hits:  99%/4983  
      🟩 GCC8               Pass: 100%/6   | Total: 23m 30s | Avg:  3m 55s | Max:  4m 41s | Hits:  99%/4983  
      🟩 GCC9               Pass: 100%/6   | Total: 24m 52s | Avg:  4m 08s | Max:  4m 59s | Hits:  99%/4983  
      🟩 GCC10              Pass: 100%/4   | Total: 17m 05s | Avg:  4m 16s | Max:  4m 42s | Hits:  99%/3476  
      🟩 GCC11              Pass: 100%/7   | Total: 30m 02s | Avg:  4m 17s | Max:  4m 46s | Hits:  99%/6069  
      🟩 GCC12              Pass: 100%/4   | Total: 19m 14s | Avg:  4m 48s | Max:  5m 12s | Hits:  99%/3468  
      🟩 GCC13              Pass: 100%/28  | Total:  9h 48m | Avg: 21m 00s | Max:  1h 19m | Hits:  96%/24276 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 16m 19s | Avg:  5m 26s | Max:  5m 43s | Hits: 100%/2379  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 38s | Avg: 12m 38s | Max: 12m 38s | Hits:  99%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 20m 12s | Avg: 10m 06s | Max: 10m 06s | Hits:  99%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 31m 19s | Avg: 10m 26s | Max: 10m 54s | Hits:  99%/2127  
    🔍 cxx_family: Clang 🔍
      🔍 Clang              Pass:  98%/59  | Total: 11h 23m | Avg: 11m 34s | Max: 51m 37s | Hits:  99%/49804 
      🟩 GCC                Pass: 100%/63  | Total: 12h 13m | Avg: 11m 38s | Max:  1h 19m | Hits:  98%/53820 
      🟩 Intel              Pass: 100%/3   | Total: 16m 19s | Avg:  5m 26s | Max:  5m 43s | Hits: 100%/2379  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 04m | Avg: 10m 41s | Max: 12m 38s | Hits:  99%/4254  
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/99  | Total:  8h 38m | Avg:  5m 14s | Max: 54m 04s | Hits:  99%/83380 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  4h 51m | Avg: 36m 24s | Max:  1h 19m | Hits:  94%/6936  
      🟩 GraphCapture       Pass: 100%/8   | Total:  3h 20m | Avg: 25m 05s | Max: 36m 30s | Hits:  99%/6936  
      🟩 HostLaunch         Pass: 100%/8   | Total:  3h 32m | Avg: 26m 30s | Max: 43m 28s | Hits:  99%/6936  
      🔍 TestGPU            Pass:  87%/8   | Total:  4h 33m | Avg: 34m 13s | Max: 51m 37s | Hits:  99%/6069  
    🔍 std: 17 🔍
      🟩 11                 Pass: 100%/34  | Total:  6h 45m | Avg: 11m 55s | Max:  1h 19m | Hits:  98%/29047 
      🟩 14                 Pass: 100%/37  | Total:  6h 37m | Avg: 10m 44s | Max: 54m 04s | Hits:  98%/31174 
      🔍 17                 Pass:  97%/36  | Total:  6h 26m | Avg: 10m 43s | Max: 43m 41s | Hits:  99%/29525 
      🟩 20                 Pass: 100%/24  | Total:  5h 07m | Avg: 12m 49s | Max: 51m 37s | Hits:  99%/20511 
    🟨 gpu
      🟨 v100               Pass:  99%/131 | Total:  1d 00h | Avg: 11m 25s | Max:  1h 19m | Hits:  99%/110257
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 13m 16s | Avg:  4m 25s | Max:  4m 46s | Hits:  99%/2601  
      🟩 90a                Pass: 100%/4   | Total: 14m 46s | Avg:  3m 41s | Max:  3m 59s | Hits:  99%/3468  
    
  • 🟩 thrust: Pass: 100%/118 | Total: 11h 17m | Avg: 5m 44s | Max: 30m 55s | Hits: 99%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 10h 44m | Avg:  5m 51s | Max: 30m 55s | Hits:  99%/129492
      🟩 arm64              Pass: 100%/8   | Total: 32m 48s | Avg:  4m 06s | Max:  4m 25s | Hits:  99%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total: 56m 26s | Avg:  3m 45s | Max: 13m 36s | Hits:  99%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 11m 13s | Avg:  3m 44s | Max:  3m 51s | Hits:  99%/3534  
      🟩 12.5               Pass: 100%/100 | Total: 10h 09m | Avg:  6m 05s | Max: 30m 55s | Hits:  99%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  3m 34s | Hits: 100%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total: 56m 26s | Avg:  3m 45s | Max: 13m 36s | Hits:  99%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 11m 13s | Avg:  3m 44s | Max:  3m 51s | Hits:  99%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total: 10h 02m | Avg:  6m 08s | Max: 30m 55s | Hits:  99%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  3m 34s | Hits: 100%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 11h 10m | Avg:  5m 46s | Max: 30m 55s | Hits:  99%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 22m 42s | Avg:  3m 47s | Max:  4m 22s | Hits: 100%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 12m 41s | Avg:  4m 13s | Max:  4m 21s | Hits: 100%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 13m 53s | Avg:  3m 28s | Max:  3m 32s | Hits: 100%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 14m 35s | Avg:  3m 38s | Max:  3m 50s | Hits: 100%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 15m 02s | Avg:  3m 45s | Max:  3m 55s | Hits: 100%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 14m 51s | Avg:  3m 42s | Max:  3m 49s | Hits: 100%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 15m 00s | Avg:  3m 45s | Max:  4m 16s | Hits: 100%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 15m 08s | Avg:  3m 47s | Max:  3m 58s | Hits: 100%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  2h 14m | Avg:  7m 28s | Max: 26m 22s | Hits: 100%/21186 
      🟩 GCC6               Pass: 100%/2   | Total:  5m 43s | Avg:  2m 51s | Max:  2m 52s | Hits:  99%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 19m 07s | Avg:  3m 11s | Max:  3m 55s | Hits:  99%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 46m 49s | Avg:  7m 48s | Max: 30m 55s | Hits:  86%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 20m 23s | Avg:  3m 23s | Max:  3m 55s | Hits:  99%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 14m 49s | Avg:  3m 42s | Max:  4m 00s | Hits:  99%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 25m 45s | Avg:  3m 40s | Max:  3m 55s | Hits:  99%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 15m 32s | Avg:  3m 53s | Max:  4m 11s | Hits:  99%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  2h 12m | Avg:  6m 36s | Max: 16m 46s | Hits:  99%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 13m 40s | Avg:  4m 33s | Max:  4m 46s | Hits: 100%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 13m 36s | Avg: 13m 36s | Max: 13m 36s | Hits:  98%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 23m 08s | Avg: 11m 34s | Max: 12m 01s | Hits:  98%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 28m | Avg: 14m 43s | Max: 18m 18s | Hits:  98%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  4h 18m | Avg:  5m 03s | Max: 26m 22s | Hits: 100%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  4h 40m | Avg:  5m 05s | Max: 30m 55s | Hits:  98%/64788 
      🟩 Intel              Pass: 100%/3   | Total: 13m 40s | Avg:  4m 33s | Max:  4m 46s | Hits: 100%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 05m | Avg: 13m 53s | Max: 18m 18s | Hits:  98%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 11h 17m | Avg:  5m 44s | Max: 30m 55s | Hits:  99%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 22m | Avg:  4m 28s | Max: 30m 55s | Hits:  99%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 41m | Avg:  9m 11s | Max: 18m 18s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 13m | Avg: 16m 43s | Max: 26m 22s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 11m 13s | Avg:  3m 44s | Max:  3m 51s | Hits:  99%/3534  
      🟩 90a                Pass: 100%/4   | Total: 13m 28s | Avg:  3m 22s | Max:  3m 35s | Hits:  99%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 22m | Avg:  4m 44s | Max: 26m 22s | Hits:  99%/35328 
      🟩 14                 Pass: 100%/34  | Total:  3h 13m | Avg:  5m 41s | Max: 16m 58s | Hits:  99%/40020 
      🟩 17                 Pass: 100%/33  | Total:  3h 34m | Avg:  6m 29s | Max: 30m 55s | Hits:  97%/38847 
      🟩 20                 Pass: 100%/21  | Total:  2h 07m | Avg:  6m 03s | Max: 18m 18s | Hits:  99%/24717 
    
  • 🟩 libcudacxx: Pass: 100%/112 | Total: 11h 46m | Avg: 6m 18s | Max: 26m 15s | Hits: 96%/273250

    🟩 cpu
      🟩 amd64              Pass: 100%/104 | Total: 11h 12m | Avg:  6m 27s | Max: 26m 15s | Hits:  95%/250904
      🟩 arm64              Pass: 100%/8   | Total: 33m 50s | Avg:  4m 13s | Max:  4m 51s | Hits:  98%/22346 
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total: 54m 54s | Avg:  3m 39s | Max: 15m 17s | Hits:  98%/39780 
      🟩 11.8               Pass: 100%/3   | Total: 38m 44s | Avg: 12m 54s | Max: 19m 21s | Hits:  58%/8064  
      🟩 12.5               Pass: 100%/94  | Total: 10h 12m | Avg:  6m 31s | Max: 26m 15s | Hits:  96%/225406
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 35m 34s | Avg: 17m 47s | Max: 18m 07s | Hits:  37%/6099  
      🟩 nvcc11.1           Pass: 100%/15  | Total: 54m 54s | Avg:  3m 39s | Max: 15m 17s | Hits:  98%/39780 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 38m 44s | Avg: 12m 54s | Max: 19m 21s | Hits:  58%/8064  
      🟩 nvcc12.5           Pass: 100%/92  | Total:  9h 37m | Avg:  6m 16s | Max: 26m 15s | Hits:  98%/219307
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 35m 34s | Avg: 17m 47s | Max: 18m 07s | Hits:  37%/6099  
      🟩 nvcc               Pass: 100%/110 | Total: 11h 10m | Avg:  6m 05s | Max: 26m 15s | Hits:  97%/267151
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 23m 48s | Avg:  3m 58s | Max:  5m 05s | Hits:  98%/16160 
      🟩 Clang10            Pass: 100%/3   | Total: 14m 41s | Avg:  4m 53s | Max:  5m 18s | Hits:  99%/8109  
      🟩 Clang11            Pass: 100%/4   | Total: 16m 08s | Avg:  4m 02s | Max:  4m 29s | Hits:  98%/11181 
      🟩 Clang12            Pass: 100%/4   | Total: 16m 39s | Avg:  4m 09s | Max:  5m 04s | Hits:  97%/11181 
      🟩 Clang13            Pass: 100%/4   | Total: 15m 24s | Avg:  3m 51s | Max:  4m 16s | Hits:  98%/11181 
      🟩 Clang14            Pass: 100%/4   | Total: 15m 44s | Avg:  3m 56s | Max:  4m 22s | Hits:  99%/11181 
      🟩 Clang15            Pass: 100%/4   | Total: 16m 55s | Avg:  4m 13s | Max:  5m 11s | Hits:  97%/11173 
      🟩 Clang16            Pass: 100%/4   | Total: 16m 00s | Avg:  4m 00s | Max:  4m 25s | Hits:  98%/11173 
      🟩 Clang17            Pass: 100%/14  | Total:  2h 16m | Avg:  9m 44s | Max: 19m 11s | Hits:  85%/28445 
      🟩 GCC6               Pass: 100%/2   | Total:  5m 15s | Avg:  2m 37s | Max:  2m 43s | Hits:  98%/5045  
      🟩 GCC7               Pass: 100%/6   | Total: 17m 33s | Avg:  2m 55s | Max:  3m 24s | Hits:  99%/16146 
      🟩 GCC8               Pass: 100%/6   | Total: 19m 36s | Avg:  3m 16s | Max:  4m 17s | Hits:  98%/16154 
      🟩 GCC9               Pass: 100%/6   | Total: 19m 25s | Avg:  3m 14s | Max:  4m 31s | Hits:  97%/16158 
      🟩 GCC10              Pass: 100%/4   | Total: 13m 41s | Avg:  3m 25s | Max:  3m 40s | Hits:  98%/11181 
      🟩 GCC11              Pass: 100%/7   | Total: 53m 03s | Avg:  7m 34s | Max: 19m 21s | Hits:  81%/19237 
      🟩 GCC12              Pass: 100%/4   | Total: 14m 54s | Avg:  3m 43s | Max:  4m 39s | Hits:  98%/11173 
      🟩 GCC13              Pass: 100%/21  | Total:  3h 22m | Avg:  9m 38s | Max: 26m 15s | Hits:  98%/33902 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 14m 49s | Avg:  4m 56s | Max:  5m 25s | Hits:  98%/8099  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 15m 17s | Avg: 15m 17s | Max: 15m 17s | Hits:  99%/2536  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 21m 46s | Avg: 10m 53s | Max: 11m 11s | Hits:  98%/5434  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 36m 34s | Avg: 12m 11s | Max: 12m 45s | Hits:  98%/8401  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/47  | Total:  4h 31m | Avg:  5m 46s | Max: 19m 11s | Hits:  95%/119784
      🟩 GCC                Pass: 100%/56  | Total:  5h 46m | Avg:  6m 10s | Max: 26m 15s | Hits:  95%/128996
      🟩 Intel              Pass: 100%/3   | Total: 14m 49s | Avg:  4m 56s | Max:  5m 25s | Hits:  98%/8099  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 13m | Avg: 12m 16s | Max: 15m 17s | Hits:  98%/16371 
    🟩 gpu
      🟩 v100               Pass: 100%/112 | Total: 11h 46m | Avg:  6m 18s | Max: 26m 15s | Hits:  96%/273250
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  8h 00m | Avg:  4m 51s | Max: 19m 21s | Hits:  96%/273230
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 24m | Avg: 21m 04s | Max: 26m 15s | Hits: 100%/20    
      🟩 Test               Pass: 100%/8   | Total:  2h 19m | Avg: 17m 25s | Max: 20m 45s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 49s | Avg:  1m 49s | Max:  1m 49s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 38m 44s | Avg: 12m 54s | Max: 19m 21s | Hits:  58%/8064  
      🟩 90a                Pass: 100%/4   | Total: 13m 19s | Avg:  3m 19s | Max:  3m 40s | Hits:  99%/11536 
    🟩 std
      🟩 11                 Pass: 100%/29  | Total:  2h 12m | Avg:  4m 33s | Max: 16m 35s | Hits:  99%/58200 
      🟩 14                 Pass: 100%/32  | Total:  3h 26m | Avg:  6m 27s | Max: 25m 14s | Hits:  96%/81788 
      🟩 17                 Pass: 100%/31  | Total:  3h 33m | Avg:  6m 53s | Max: 20m 45s | Hits:  94%/84134 
      🟩 20                 Pass: 100%/19  | Total:  2h 31m | Avg:  7m 59s | Max: 26m 15s | Hits:  94%/49128 
    
  • 🟩 cudax: Pass: 100%/55 | Total: 2h 25m | Avg: 2m 38s | Max: 6m 59s | Hits: 90%/2133

    🟩 cpu
      🟩 amd64              Pass: 100%/51  | Total:  2h 15m | Avg:  2m 39s | Max:  6m 59s | Hits:  90%/1977  
      🟩 arm64              Pass: 100%/4   | Total:  9m 52s | Avg:  2m 28s | Max:  2m 50s | Hits:  89%/156   
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 01m | Avg:  2m 39s | Max:  6m 05s | Hits:  90%/891   
      🟩 12.5               Pass: 100%/32  | Total:  1h 24m | Avg:  2m 37s | Max:  6m 59s | Hits:  90%/1242  
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 01m | Avg:  2m 39s | Max:  6m 05s | Hits:  90%/891   
      🟩 nvcc12.5           Pass: 100%/32  | Total:  1h 24m | Avg:  2m 37s | Max:  6m 59s | Hits:  90%/1242  
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/55  | Total:  2h 25m | Avg:  2m 38s | Max:  6m 59s | Hits:  90%/2133  
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  4m 16s | Avg:  2m 08s | Max:  2m 15s | Hits:  92%/78    
      🟩 Clang10            Pass: 100%/2   | Total:  4m 09s | Avg:  2m 04s | Max:  2m 12s | Hits:  92%/78    
      🟩 Clang11            Pass: 100%/4   | Total:  8m 11s | Avg:  2m 02s | Max:  2m 10s | Hits:  92%/156   
      🟩 Clang12            Pass: 100%/4   | Total:  8m 07s | Avg:  2m 01s | Max:  2m 12s | Hits:  92%/156   
      🟩 Clang13            Pass: 100%/4   | Total:  8m 32s | Avg:  2m 08s | Max:  2m 37s | Hits:  92%/156   
      🟩 Clang14            Pass: 100%/6   | Total: 18m 36s | Avg:  3m 06s | Max:  5m 08s | Hits:  94%/234   
      🟩 Clang15            Pass: 100%/2   | Total:  4m 26s | Avg:  2m 13s | Max:  2m 20s | Hits:  92%/78    
      🟩 Clang16            Pass: 100%/6   | Total: 19m 20s | Avg:  3m 13s | Max:  5m 14s | Hits:  94%/234   
      🟩 GCC9               Pass: 100%/2   | Total:  3m 52s | Avg:  1m 56s | Max:  2m 03s | Hits:  87%/78    
      🟩 GCC10              Pass: 100%/4   | Total:  7m 39s | Avg:  1m 54s | Max:  2m 03s | Hits:  87%/156   
      🟩 GCC11              Pass: 100%/4   | Total:  7m 27s | Avg:  1m 51s | Max:  1m 56s | Hits:  87%/156   
      🟩 GCC12              Pass: 100%/12  | Total: 34m 48s | Avg:  2m 54s | Max:  4m 58s | Hits:  89%/468   
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  2m 45s | Avg:  2m 45s | Max:  2m 45s | Hits:  89%/39    
      🟩 MSVC14.36          Pass: 100%/1   | Total:  6m 05s | Avg:  6m 05s | Max:  6m 05s | Hits:  66%/33    
      🟩 MSVC14.39          Pass: 100%/1   | Total:  6m 59s | Avg:  6m 59s | Max:  6m 59s | Hits:  66%/33    
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 15m | Avg:  2m 31s | Max:  5m 14s | Hits:  93%/1170  
      🟩 GCC                Pass: 100%/22  | Total: 53m 46s | Avg:  2m 26s | Max:  4m 58s | Hits:  88%/858   
      🟩 Intel              Pass: 100%/1   | Total:  2m 45s | Avg:  2m 45s | Max:  2m 45s | Hits:  89%/39    
      🟩 MSVC               Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max:  6m 59s | Hits:  66%/66    
    🟩 gpu
      🟩 v100               Pass: 100%/55  | Total:  2h 25m | Avg:  2m 38s | Max:  6m 59s | Hits:  90%/2133  
    🟩 jobs
      🟩 Build              Pass: 100%/47  | Total:  1h 47m | Avg:  2m 16s | Max:  6m 59s | Hits:  89%/1821  
      🟩 Test               Pass: 100%/8   | Total: 38m 00s | Avg:  4m 45s | Max:  5m 14s | Hits:  97%/312   
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  1m 44s | Avg:  1m 44s | Max:  1m 44s | Hits:  87%/39    
      🟩 90a                Pass: 100%/1   | Total:  2m 05s | Avg:  2m 05s | Max:  2m 05s | Hits:  87%/39    
    🟩 std
      🟩 17                 Pass: 100%/31  | Total:  1h 15m | Avg:  2m 26s | Max:  5m 14s | Hits:  91%/1209  
      🟩 20                 Pass: 100%/24  | Total:  1h 09m | Avg:  2m 54s | Max:  6m 59s | Hits:  89%/924   
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 10m 28s | Avg: 10m 28s | Max: 10m 28s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 417)

# Runner
305 linux-amd64-cpu16
61 linux-amd64-gpu-v100-latest-1
28 linux-arm64-cpu16
23 windows-amd64-cpu16

cudax/include/cuda/experimental/__stream/stream.cuh Outdated
//! @brief Constructs a stream on a specified device and with specified priority
//!
//! @throws cuda_error if stream creation fails
explicit stream(device __dev, int __priority)
Collaborator

I would love it if we could try to be const-correct in new APIs:

Suggested change
- explicit stream(device __dev, int __priority)
+ explicit stream(const device __dev, const int __priority)

Contributor Author

If these are passed by value, is there any value in having them const?

Collaborator

From the perspective of the user these are identical. From our perspective it would guard against accidentally changing them. I'm generally not a fan of top-level qualifiers on function arguments, and they are very much an implementation detail, so we can add or remove them in any API at any time.
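
To illustrate the point about top-level qualifiers, here is a minimal sketch. `device_id` and `pick_priority` are hypothetical names used only for this example and are not part of cudax. It shows that adding `const` to by-value parameters leaves the function type unchanged for callers while protecting the parameters inside the body.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a device handle; not the cudax type.
struct device_id
{
  int value;
};

// Declaration without top-level const: this is what callers see.
int pick_priority(device_id dev, int priority);

// Definition with top-level const: it defines the same function, but the body
// can no longer reassign its parameters by accident.
int pick_priority(const device_id dev, const int priority)
{
  // priority = 0; // would not compile: priority is const inside the body
  return dev.value + priority;
}

// Top-level const on a by-value parameter is not part of the function type.
static_assert(std::is_same<int(device_id, int), int(const device_id, const int)>::value,
              "top-level const does not change the signature");
```

Because the two spellings name the same function, the qualifier can be added or dropped in the definition without affecting any caller.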

cudax/include/cuda/experimental/__stream/stream.cuh Outdated
Comment on lines +73 to +79
//! @brief Construct a new `stream` object into the moved-from state.
//!
//! @post `stream()` returns an invalid stream handle
// Can't be constexpr because invalid_stream isn't
explicit stream(uninit_t) noexcept
: stream_ref(detail::invalid_stream)
{}
Collaborator

Should this be a public constructor?

Contributor Author

This aligns with the event constructor taking uninit and works as an opt-in way to create a stream that will be assigned to later.
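
The opt-in pattern described here can be sketched without the CUDA toolkit. Everything below (`handle_t`, `invalid_handle`, `create_handle`, `destroy_handle`, `owned_handle`, and this local `uninit_t`) is a hypothetical stand-in chosen for illustration, not the cudax implementation. The sketch shows an owning type whose tag constructor starts in the moved-from state and is filled in later by move assignment.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins so the sketch compiles without the CUDA toolkit.
using handle_t = int;
constexpr handle_t invalid_handle = -1;

inline int& live_handles()
{
  static int count = 0;
  return count;
}
inline handle_t create_handle()
{
  static handle_t next = 0;
  ++live_handles();
  return next++;
}
inline void destroy_handle(handle_t)
{
  --live_handles();
}

// Tag type: callers must ask for the empty state explicitly.
struct uninit_t {};
constexpr uninit_t uninit{};

class owned_handle
{
public:
  // Default construction acquires a resource.
  owned_handle() : h_(create_handle()) {}

  // Opt-in construction into the moved-from state; acquires nothing.
  explicit owned_handle(uninit_t) noexcept : h_(invalid_handle) {}

  owned_handle(owned_handle&& other) noexcept
      : h_(std::exchange(other.h_, invalid_handle))
  {}

  owned_handle& operator=(owned_handle&& other) noexcept
  {
    if (this != &other)
    {
      reset();
      h_ = std::exchange(other.h_, invalid_handle);
    }
    return *this;
  }

  ~owned_handle() { reset(); }

  bool valid() const noexcept { return h_ != invalid_handle; }

private:
  // Releases the resource if one is held.
  void reset() noexcept
  {
    if (h_ != invalid_handle)
    {
      destroy_handle(h_);
      h_ = invalid_handle;
    }
  }

  handle_t h_;
};
```

A caller would write `owned_handle later(uninit);` and assign a real object into it afterwards, so no resource is created just to be thrown away.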

cudax/include/cuda/experimental/__stream/stream.cuh Outdated
cudax/include/cuda/experimental/__event/event.cuh Outdated
cudax/include/cuda/experimental/stream.cuh Outdated
cudax/include/cuda/experimental/__stream/stream.cuh Outdated
cudax/include/cuda/experimental/__stream/stream.cuh Outdated
Comment on lines +175 to +184
_CCCL_NODISCARD static stream from_native_handle(::cudaStream_t __handle)
{
return stream(__handle);
}

// Disallow construction from an `int`, e.g., `0`.
static stream from_native_handle(int) = delete;

// Disallow construction from `nullptr`.
static stream from_native_handle(_CUDA_VSTD::nullptr_t) = delete;
Contributor

Question: why shouldn't those be ctors? What problem are factory functions solving here that ctors cannot?

Contributor Author

This one aligns with the event factory function; we can discuss it as a broader design question for cudax. These functions take ownership of the stream, so I like the explicitness of a named function.

Collaborator

@ericniebler points out that these functions take ownership of the passed-in stream, so he wants them to stand out in the code.
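
As a rough illustration of the factory-function design discussed in this thread, the sketch below uses a hypothetical `native_stream_t` (an opaque pointer, shaped like `cudaStream_t`) and a hypothetical `owning_stream` class; neither is the cudax type, and destruction of the handle is left out to keep the example small. It shows how the deleted overloads reject `int` and `nullptr` arguments while the named factory marks the ownership transfer at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical opaque handle: a pointer to an incomplete struct.
struct native_stream_st;
using native_stream_t = native_stream_st*;

class owning_stream
{
public:
  // Named factory: the call site spells out that ownership is transferred.
  static owning_stream from_native_handle(native_stream_t h)
  {
    return owning_stream(h);
  }

  // Reject integers and nullptr at compile time.
  static owning_stream from_native_handle(int) = delete;
  static owning_stream from_native_handle(std::nullptr_t) = delete;

  native_stream_t get() const noexcept { return h_; }

private:
  // The adopting constructor is private, so the factory is the only way in.
  explicit owning_stream(native_stream_t h) noexcept : h_(h) {}

  native_stream_t h_;
};

// Detection helper: true when from_native_handle(Arg) is a valid call.
template <class Arg, class = void>
struct can_adopt : std::false_type {};

template <class Arg>
struct can_adopt<Arg, decltype(void(owning_stream::from_native_handle(std::declval<Arg>())))>
    : std::true_type {};

static_assert(can_adopt<native_stream_t>::value, "a real handle is accepted");
static_assert(!can_adopt<int>::value, "an int selects the deleted overload");
static_assert(!can_adopt<std::nullptr_t>::value, "nullptr selects the deleted overload");
```

Without the deleted overloads, a literal `0` or `nullptr` would convert silently to the pointer-typed handle, which is the mistake the snippet under review guards against.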

*
* \return value representing the priority of the wrapped stream.
*/
_CCCL_NODISCARD int priority() const
Collaborator

@ericniebler Do we want to return a strong type?

Collaborator

@jrhemstad jrhemstad left a comment

Question: I've always thought an owning stream abstraction should just be a unique_ptr with a custom deleter that overrides the pointer type like:

struct stream_deleter{
   using pointer = cudaStream_t;
   void operator()(cudaStream_t s){ cudaStreamDestroy(s); }
};

struct owning_stream{
// other stuff
private:
   std::unique_ptr<cudaStream_t, stream_deleter> __s;
};

Does that not work?
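
For readers who want to try the idea in this comment without a GPU, here is a self-contained sketch. `fake_stream_st`, `fake_stream_create`, and `fake_stream_destroy` are hypothetical stand-ins for the CUDA handle and its create/destroy calls, and the first template argument of `unique_ptr` is spelled as the pointee type, which is an assumption of this sketch rather than what the comment wrote. It shows the deleter's `pointer` alias making `unique_ptr` store the handle directly.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for cudaStream_t and its create/destroy functions.
struct fake_stream_st
{
  int id;
};
using fake_stream_t = fake_stream_st*;

inline int& live_streams()
{
  static int count = 0;
  return count;
}
inline fake_stream_t fake_stream_create()
{
  ++live_streams();
  return new fake_stream_st{live_streams()};
}
inline void fake_stream_destroy(fake_stream_t s)
{
  --live_streams();
  delete s;
}

// The nested `pointer` alias tells unique_ptr which type to store.
struct stream_deleter
{
  using pointer = fake_stream_t;
  void operator()(pointer s) const noexcept { fake_stream_destroy(s); }
};

class owning_stream
{
public:
  owning_stream() : s_(fake_stream_create()) {}

  fake_stream_t get() const noexcept { return s_.get(); }

private:
  // Move-only semantics and destruction come from unique_ptr.
  std::unique_ptr<fake_stream_st, stream_deleter> s_;
};
```

One thing to weigh with this design: `unique_ptr` uses `nullptr` as its empty state, whereas a null `cudaStream_t` conventionally names the default stream, so an owning wrapper built this way cannot tell "empty" apart from "default stream".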

cudax/include/cuda/experimental/__stream/stream.cuh Outdated
{
// TODO consider an optimization to not create an event every time and instead have one persistent event or one per
// stream
assert(__stream.get() != nullptr);
Collaborator

Isn't __stream a cudaStream_t here?

Contributor Author

It should be a check against invalid_stream; corrected.

@ericniebler ericniebler linked an issue Jul 29, 2024 that may be closed by this pull request
Contributor

🟩 CI finished in 8h 04m: Pass: 100%/417 | Total: 1d 20h | Avg: 6m 24s | Max: 37m 49s | Hits: 98%/525419
  • 🟩 cub: Pass: 100%/131 | Total: 19h 36m | Avg: 8m 58s | Max: 37m 49s | Hits: 99%/111124

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total: 18h 58m | Avg:  9m 15s | Max: 37m 49s | Hits:  99%/104188
      🟩 arm64              Pass: 100%/8   | Total: 38m 29s | Avg:  4m 48s | Max:  5m 28s | Hits:  99%/6936  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 04m | Avg:  4m 19s | Max: 12m 16s | Hits:  99%/11792 
      🟩 11.8               Pass: 100%/3   | Total: 13m 47s | Avg:  4m 35s | Max:  5m 01s | Hits:  99%/2601  
      🟩 12.5               Pass: 100%/113 | Total: 18h 18m | Avg:  9m 43s | Max: 37m 49s | Hits:  99%/96731 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  3m 40s | Hits: 100%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 04m | Avg:  4m 19s | Max: 12m 16s | Hits:  99%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 13m 47s | Avg:  4m 35s | Max:  5m 01s | Hits:  99%/2601  
      🟩 nvcc12.5           Pass: 100%/111 | Total: 18h 10m | Avg:  9m 49s | Max: 37m 49s | Hits:  99%/95295 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  3m 40s | Hits: 100%/1436  
      🟩 nvcc               Pass: 100%/129 | Total: 19h 29m | Avg:  9m 03s | Max: 37m 49s | Hits:  99%/109688
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 26m 49s | Avg:  4m 28s | Max:  5m 16s | Hits: 100%/4980  
      🟩 Clang10            Pass: 100%/3   | Total: 15m 26s | Avg:  5m 08s | Max:  5m 22s | Hits: 100%/2607  
      🟩 Clang11            Pass: 100%/4   | Total: 17m 17s | Avg:  4m 19s | Max:  4m 24s | Hits: 100%/3476  
      🟩 Clang12            Pass: 100%/4   | Total: 17m 55s | Avg:  4m 28s | Max:  4m 44s | Hits: 100%/3476  
      🟩 Clang13            Pass: 100%/4   | Total: 18m 08s | Avg:  4m 32s | Max:  4m 49s | Hits: 100%/3476  
      🟩 Clang14            Pass: 100%/4   | Total: 17m 58s | Avg:  4m 29s | Max:  4m 42s | Hits: 100%/3476  
      🟩 Clang15            Pass: 100%/4   | Total: 18m 30s | Avg:  4m 37s | Max:  4m 44s | Hits: 100%/3468  
      🟩 Clang16            Pass: 100%/4   | Total: 18m 03s | Avg:  4m 30s | Max:  4m 45s | Hits: 100%/3468  
      🟩 Clang17            Pass: 100%/26  | Total:  6h 06m | Avg: 14m 06s | Max: 25m 31s | Hits:  99%/22244 
      🟩 GCC6               Pass: 100%/2   | Total:  7m 11s | Avg:  3m 35s | Max:  3m 57s | Hits:  99%/1582  
      🟩 GCC7               Pass: 100%/6   | Total: 24m 18s | Avg:  4m 03s | Max:  4m 29s | Hits:  99%/4983  
      🟩 GCC8               Pass: 100%/6   | Total: 23m 45s | Avg:  3m 57s | Max:  4m 20s | Hits:  99%/4983  
      🟩 GCC9               Pass: 100%/6   | Total: 25m 19s | Avg:  4m 13s | Max:  4m 40s | Hits:  99%/4983  
      🟩 GCC10              Pass: 100%/4   | Total: 18m 05s | Avg:  4m 31s | Max:  4m 42s | Hits:  99%/3476  
      🟩 GCC11              Pass: 100%/7   | Total: 31m 54s | Avg:  4m 33s | Max:  5m 01s | Hits:  99%/6069  
      🟩 GCC12              Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  4m 58s | Hits:  99%/3468  
      🟩 GCC13              Pass: 100%/28  | Total:  7h 09m | Avg: 15m 21s | Max: 37m 49s | Hits:  99%/24276 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 24s | Avg:  5m 08s | Max:  5m 28s | Hits: 100%/2379  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 16s | Avg: 12m 16s | Max: 12m 16s | Hits:  99%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 20m 39s | Avg: 10m 19s | Max: 10m 39s | Hits:  99%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 32m 38s | Avg: 10m 52s | Max: 11m 43s | Hits:  99%/2127  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  8h 36m | Avg:  8m 45s | Max: 25m 31s | Hits:  99%/50671 
      🟩 GCC                Pass: 100%/63  | Total:  9h 39m | Avg:  9m 11s | Max: 37m 49s | Hits:  99%/53820 
      🟩 Intel              Pass: 100%/3   | Total: 15m 24s | Avg:  5m 08s | Max:  5m 28s | Hits: 100%/2379  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 05m | Avg: 10m 55s | Max: 12m 16s | Hits:  99%/4254  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total: 19h 36m | Avg:  8m 58s | Max: 37m 49s | Hits:  99%/111124
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 56m | Avg:  4m 48s | Max: 12m 16s | Hits:  99%/83380 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 37m | Avg: 19m 42s | Max: 26m 01s | Hits:  99%/6936  
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 31m | Avg: 18m 59s | Max: 33m 08s | Hits:  99%/6936  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 39m | Avg: 19m 57s | Max: 21m 59s | Hits:  99%/6936  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 51m | Avg: 28m 53s | Max: 37m 49s | Hits:  99%/6936  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 13m 47s | Avg:  4m 35s | Max:  5m 01s | Hits:  99%/2601  
      🟩 90a                Pass: 100%/4   | Total: 14m 22s | Avg:  3m 35s | Max:  3m 44s | Hits:  99%/3468  
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  4h 25m | Avg:  7m 48s | Max: 24m 31s | Hits:  99%/29047 
      🟩 14                 Pass: 100%/37  | Total:  5h 17m | Avg:  8m 34s | Max: 37m 04s | Hits:  99%/31174 
      🟩 17                 Pass: 100%/36  | Total:  5h 27m | Avg:  9m 06s | Max: 34m 46s | Hits:  99%/30392 
      🟩 20                 Pass: 100%/24  | Total:  4h 26m | Avg: 11m 05s | Max: 37m 49s | Hits:  99%/20511 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 10h 57m | Avg: 5m 34s | Max: 32m 13s | Hits: 99%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 10h 25m | Avg:  5m 41s | Max: 32m 13s | Hits:  99%/129492
      🟩 arm64              Pass: 100%/8   | Total: 32m 25s | Avg:  4m 03s | Max:  4m 31s | Hits:  99%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total: 58m 00s | Avg:  3m 52s | Max: 13m 15s | Hits:  99%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 11m 25s | Avg:  3m 48s | Max:  4m 01s | Hits:  99%/3534  
      🟩 12.5               Pass: 100%/100 | Total:  9h 48m | Avg:  5m 52s | Max: 32m 13s | Hits:  99%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 01s | Avg:  3m 30s | Max:  3m 36s | Hits: 100%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total: 58m 00s | Avg:  3m 52s | Max: 13m 15s | Hits:  99%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 11m 25s | Avg:  3m 48s | Max:  4m 01s | Hits:  99%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  9h 41m | Avg:  5m 55s | Max: 32m 13s | Hits:  99%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 01s | Avg:  3m 30s | Max:  3m 36s | Hits: 100%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 10h 50m | Avg:  5m 36s | Max: 32m 13s | Hits:  99%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 22m 48s | Avg:  3m 48s | Max:  4m 19s | Hits: 100%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 12m 53s | Avg:  4m 17s | Max:  4m 32s | Hits: 100%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 14m 55s | Avg:  3m 43s | Max:  3m 53s | Hits: 100%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 15m 03s | Avg:  3m 45s | Max:  3m 54s | Hits: 100%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 15m 10s | Avg:  3m 47s | Max:  4m 05s | Hits: 100%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 14m 40s | Avg:  3m 40s | Max:  3m 46s | Hits: 100%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 15m 02s | Avg:  3m 45s | Max:  4m 03s | Hits: 100%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 15m 17s | Avg:  3m 49s | Max:  4m 01s | Hits: 100%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  1h 56m | Avg:  6m 29s | Max: 16m 12s | Hits:  99%/21186 
      🟩 GCC6               Pass: 100%/2   | Total:  6m 00s | Avg:  3m 00s | Max:  3m 01s | Hits:  99%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 20m 55s | Avg:  3m 29s | Max:  3m 55s | Hits:  99%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 20m 23s | Avg:  3m 23s | Max:  3m 52s | Hits:  99%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 20m 17s | Avg:  3m 22s | Max:  3m 55s | Hits:  99%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 43m 41s | Avg: 10m 55s | Max: 32m 13s | Hits:  80%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 25m 55s | Avg:  3m 42s | Max:  4m 04s | Hits:  99%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 15m 46s | Avg:  3m 56s | Max:  4m 31s | Hits:  99%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  2h 04m | Avg:  6m 14s | Max: 18m 49s | Hits:  99%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 13m 54s | Avg:  4m 38s | Max:  5m 00s | Hits: 100%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 13m 15s | Avg: 13m 15s | Max: 13m 15s | Hits:  98%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 22m 33s | Avg: 11m 16s | Max: 11m 36s | Hits:  98%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 27m | Avg: 14m 37s | Max: 19m 34s | Hits:  98%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  4h 02m | Avg:  4m 45s | Max: 16m 12s | Hits:  99%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  4h 37m | Avg:  5m 02s | Max: 32m 13s | Hits:  98%/64788 
      🟩 Intel              Pass: 100%/3   | Total: 13m 54s | Avg:  4m 38s | Max:  5m 00s | Hits: 100%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 03m | Avg: 13m 43s | Max: 19m 34s | Hits:  98%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 10h 57m | Avg:  5m 34s | Max: 32m 13s | Hits:  99%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 27m | Avg:  4m 31s | Max: 32m 13s | Hits:  99%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 41m | Avg:  9m 14s | Max: 19m 34s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 48m | Avg: 13m 36s | Max: 18m 49s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 11m 25s | Avg:  3m 48s | Max:  4m 01s | Hits:  99%/3534  
      🟩 90a                Pass: 100%/4   | Total: 13m 20s | Avg:  3m 20s | Max:  3m 29s | Hits:  99%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 15m | Avg:  4m 31s | Max: 18m 49s | Hits:  99%/35328 
      🟩 14                 Pass: 100%/34  | Total:  3h 33m | Avg:  6m 17s | Max: 32m 13s | Hits:  97%/40020 
      🟩 17                 Pass: 100%/33  | Total:  3h 01m | Avg:  5m 29s | Max: 19m 34s | Hits:  99%/38847 
      🟩 20                 Pass: 100%/21  | Total:  2h 06m | Avg:  6m 01s | Max: 17m 51s | Hits:  99%/24717 
    
  • 🟩 libcudacxx: Pass: 100%/112 | Total: 11h 18m | Avg: 6m 03s | Max: 20m 34s | Hits: 97%/273250

    🟩 cpu
      🟩 amd64              Pass: 100%/104 | Total: 10h 44m | Avg:  6m 12s | Max: 20m 34s | Hits:  97%/250904
      🟩 arm64              Pass: 100%/8   | Total: 33m 18s | Avg:  4m 09s | Max:  4m 48s | Hits:  98%/22346 
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total: 55m 35s | Avg:  3m 42s | Max: 16m 26s | Hits:  99%/39780 
      🟩 11.8               Pass: 100%/3   | Total: 25m 55s | Avg:  8m 38s | Max: 20m 22s | Hits:  86%/8064  
      🟩 12.5               Pass: 100%/94  | Total:  9h 56m | Avg:  6m 20s | Max: 20m 34s | Hits:  97%/225406
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 18m 30s | Hits:  37%/6099  
      🟩 nvcc11.1           Pass: 100%/15  | Total: 55m 35s | Avg:  3m 42s | Max: 16m 26s | Hits:  99%/39780 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 25m 55s | Avg:  8m 38s | Max: 20m 22s | Hits:  86%/8064  
      🟩 nvcc12.5           Pass: 100%/92  | Total:  9h 21m | Avg:  6m 06s | Max: 20m 34s | Hits:  98%/219307
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 18m 30s | Hits:  37%/6099  
      🟩 nvcc               Pass: 100%/110 | Total: 10h 43m | Avg:  5m 50s | Max: 20m 34s | Hits:  98%/267151
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 24m 28s | Avg:  4m 04s | Max:  5m 25s | Hits:  99%/16160 
      🟩 Clang10            Pass: 100%/3   | Total: 15m 25s | Avg:  5m 08s | Max:  5m 33s | Hits:  98%/8109  
      🟩 Clang11            Pass: 100%/4   | Total: 17m 14s | Avg:  4m 18s | Max:  5m 04s | Hits:  98%/11181 
      🟩 Clang12            Pass: 100%/4   | Total: 15m 32s | Avg:  3m 53s | Max:  4m 08s | Hits:  98%/11181 
      🟩 Clang13            Pass: 100%/4   | Total: 15m 39s | Avg:  3m 54s | Max:  4m 16s | Hits:  98%/11181 
      🟩 Clang14            Pass: 100%/4   | Total: 16m 18s | Avg:  4m 04s | Max:  4m 24s | Hits:  97%/11181 
      🟩 Clang15            Pass: 100%/4   | Total: 16m 06s | Avg:  4m 01s | Max:  4m 28s | Hits:  98%/11173 
      🟩 Clang16            Pass: 100%/4   | Total: 15m 29s | Avg:  3m 52s | Max:  4m 11s | Hits:  99%/11173 
      🟩 Clang17            Pass: 100%/14  | Total:  2h 20m | Avg: 10m 01s | Max: 20m 34s | Hits:  85%/28445 
      🟩 GCC6               Pass: 100%/2   | Total:  5m 21s | Avg:  2m 40s | Max:  2m 48s | Hits:  98%/5045  
      🟩 GCC7               Pass: 100%/6   | Total: 17m 17s | Avg:  2m 52s | Max:  3m 26s | Hits:  98%/16146 
      🟩 GCC8               Pass: 100%/6   | Total: 18m 29s | Avg:  3m 04s | Max:  3m 51s | Hits:  98%/16154 
      🟩 GCC9               Pass: 100%/6   | Total: 18m 33s | Avg:  3m 05s | Max:  3m 59s | Hits:  99%/16158 
      🟩 GCC10              Pass: 100%/4   | Total: 14m 23s | Avg:  3m 35s | Max:  3m 55s | Hits:  98%/11181 
      🟩 GCC11              Pass: 100%/7   | Total: 39m 36s | Avg:  5m 39s | Max: 20m 22s | Hits:  93%/19237 
      🟩 GCC12              Pass: 100%/4   | Total: 14m 41s | Avg:  3m 40s | Max:  3m 57s | Hits:  98%/11173 
      🟩 GCC13              Pass: 100%/21  | Total:  3h 00m | Avg:  8m 36s | Max: 20m 04s | Hits:  99%/33902 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 14m 55s | Avg:  4m 58s | Max:  5m 20s | Hits:  99%/8099  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 16m 26s | Avg: 16m 26s | Max: 16m 26s | Hits:  99%/2536  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 21m 53s | Avg: 10m 56s | Max: 11m 10s | Hits:  98%/5434  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 39m 13s | Avg: 13m 04s | Max: 14m 52s | Hits:  98%/8401  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/47  | Total:  4h 36m | Avg:  5m 53s | Max: 20m 34s | Hits:  95%/119784
      🟩 GCC                Pass: 100%/56  | Total:  5h 09m | Avg:  5m 31s | Max: 20m 22s | Hits:  98%/128996
      🟩 Intel              Pass: 100%/3   | Total: 14m 55s | Avg:  4m 58s | Max:  5m 20s | Hits:  99%/8099  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 17m | Avg: 12m 55s | Max: 16m 26s | Hits:  98%/16371 
    🟩 gpu
      🟩 v100               Pass: 100%/112 | Total: 11h 18m | Avg:  6m 03s | Max: 20m 34s | Hits:  97%/273250
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 50m | Avg:  4m 44s | Max: 20m 22s | Hits:  97%/273230
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 12m | Avg: 18m 06s | Max: 20m 04s | Hits: 100%/20    
      🟩 Test               Pass: 100%/8   | Total:  2h 13m | Avg: 16m 41s | Max: 20m 34s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 55s | Avg:  1m 55s | Max:  1m 55s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 25m 55s | Avg:  8m 38s | Max: 20m 22s | Hits:  86%/8064  
      🟩 90a                Pass: 100%/4   | Total: 13m 39s | Avg:  3m 24s | Max:  3m 37s | Hits:  99%/11536 
    🟩 std
      🟩 11                 Pass: 100%/29  | Total:  2h 26m | Avg:  5m 03s | Max: 20m 22s | Hits:  98%/58200 
      🟩 14                 Pass: 100%/32  | Total:  3h 11m | Avg:  5m 58s | Max: 20m 03s | Hits:  98%/81788 
      🟩 17                 Pass: 100%/31  | Total:  3h 15m | Avg:  6m 19s | Max: 20m 34s | Hits:  96%/84134 
      🟩 20                 Pass: 100%/19  | Total:  2h 22m | Avg:  7m 30s | Max: 20m 04s | Hits:  94%/49128 
    
  • 🟩 cudax: Pass: 100%/55 | Total: 2h 27m | Avg: 2m 40s | Max: 6m 36s | Hits: 73%/2133

    🟩 cpu
      🟩 amd64              Pass: 100%/51  | Total:  2h 16m | Avg:  2m 40s | Max:  6m 36s | Hits:  72%/1977  
      🟩 arm64              Pass: 100%/4   | Total: 10m 15s | Avg:  2m 33s | Max:  2m 41s | Hits:  89%/156   
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 01m | Avg:  2m 41s | Max:  6m 36s | Hits:  72%/891   
      🟩 12.5               Pass: 100%/32  | Total:  1h 25m | Avg:  2m 39s | Max:  6m 36s | Hits:  74%/1242  
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 01m | Avg:  2m 41s | Max:  6m 36s | Hits:  72%/891   
      🟩 nvcc12.5           Pass: 100%/32  | Total:  1h 25m | Avg:  2m 39s | Max:  6m 36s | Hits:  74%/1242  
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/55  | Total:  2h 27m | Avg:  2m 40s | Max:  6m 36s | Hits:  73%/2133  
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  4m 51s | Avg:  2m 25s | Max:  2m 27s | Hits:  56%/78    
      🟩 Clang10            Pass: 100%/2   | Total:  4m 40s | Avg:  2m 20s | Max:  2m 26s | Hits:  56%/78    
      🟩 Clang11            Pass: 100%/4   | Total:  9m 13s | Avg:  2m 18s | Max:  2m 24s | Hits:  56%/156   
      🟩 Clang12            Pass: 100%/4   | Total:  9m 35s | Avg:  2m 23s | Max:  2m 31s | Hits:  56%/156   
      🟩 Clang13            Pass: 100%/4   | Total: 10m 43s | Avg:  2m 40s | Max:  3m 29s | Hits:  57%/156   
      🟩 Clang14            Pass: 100%/6   | Total: 17m 39s | Avg:  2m 56s | Max:  5m 01s | Hits:  86%/234   
      🟩 Clang15            Pass: 100%/2   | Total:  4m 13s | Avg:  2m 06s | Max:  2m 10s | Hits:  79%/78    
      🟩 Clang16            Pass: 100%/6   | Total: 17m 55s | Avg:  2m 59s | Max:  4m 00s | Hits:  90%/234   
      🟩 GCC9               Pass: 100%/2   | Total:  4m 06s | Avg:  2m 03s | Max:  2m 05s | Hits:  74%/78    
      🟩 GCC10              Pass: 100%/4   | Total:  8m 04s | Avg:  2m 01s | Max:  2m 07s | Hits:  74%/156   
      🟩 GCC11              Pass: 100%/4   | Total:  7m 51s | Avg:  1m 57s | Max:  2m 02s | Hits:  74%/156   
      🟩 GCC12              Pass: 100%/12  | Total: 31m 53s | Avg:  2m 39s | Max:  3m 33s | Hits:  83%/468   
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s | Hits:  56%/39    
      🟩 MSVC14.36          Pass: 100%/1   | Total:  6m 36s | Avg:  6m 36s | Max:  6m 36s | Hits:  66%/33    
      🟩 MSVC14.39          Pass: 100%/1   | Total:  6m 36s | Avg:  6m 36s | Max:  6m 36s | Hits:  66%/33    
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 18m | Avg:  2m 37s | Max:  5m 01s | Hits:  70%/1170  
      🟩 GCC                Pass: 100%/22  | Total: 51m 54s | Avg:  2m 21s | Max:  3m 33s | Hits:  79%/858   
      🟩 Intel              Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s | Hits:  56%/39    
      🟩 MSVC               Pass: 100%/2   | Total: 13m 12s | Avg:  6m 36s | Max:  6m 36s | Hits:  66%/66    
    🟩 gpu
      🟩 v100               Pass: 100%/55  | Total:  2h 27m | Avg:  2m 40s | Max:  6m 36s | Hits:  73%/2133  
    🟩 jobs
      🟩 Build              Pass: 100%/47  | Total:  1h 56m | Avg:  2m 28s | Max:  6m 36s | Hits:  69%/1821  
      🟩 Test               Pass: 100%/8   | Total: 30m 52s | Avg:  3m 51s | Max:  5m 01s | Hits:  97%/312   
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 12s | Avg:  2m 12s | Max:  2m 12s | Hits:  74%/39    
      🟩 90a                Pass: 100%/1   | Total:  2m 18s | Avg:  2m 18s | Max:  2m 18s | Hits:  74%/39    
    🟩 std
      🟩 17                 Pass: 100%/31  | Total:  1h 18m | Avg:  2m 31s | Max:  5m 01s | Hits:  72%/1209  
      🟩 20                 Pass: 100%/24  | Total:  1h 08m | Avg:  2m 51s | Max:  6m 36s | Hits:  75%/924   
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 30s | Avg: 11m 30s | Max: 11m 30s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 417)

# Runner
305 linux-amd64-cpu16
61 linux-amd64-gpu-v100-latest-1
28 linux-arm64-cpu16
23 windows-amd64-cpu16

@miscco miscco merged commit 15e2ce0 into NVIDIA:main Jul 30, 2024
432 checks passed
@pciolkosz pciolkosz linked an issue Jul 30, 2024 that may be closed by this pull request
pciolkosz added a commit to pciolkosz/cccl that referenced this pull request Aug 4, 2024
…2093)

* construct with a stream_ref and record the event on construction

---------

Co-authored-by: Eric Niebler <[email protected]>
This pull request was closed.
Development

Successfully merging this pull request may close these issues.

add a way to associate a stream with a device Add stream abstraction
6 participants