Fix g++-14 warning on uninitialized copying #2157

Merged (1 commit) on Aug 1, 2024

Conversation

bernhardmgruber
Contributor

This PR fixes a warning with g++-14 (and nvbug 4748765). Here is the summary:

```
In function bool cuda::std::__4::__dispatch_memmove(_Up*, _Tp*, size_t)
...
error: *(unsigned char*)(&privatized_decode_op[0]) may be used uninitialized [-Werror=maybe-uninitialized]
...
*(unsigned char*)(&privatized_decode_op[0]) was declared here
 1528 |       PrivatizedDecodeOpT privatized_decode_op[NUM_ACTIVE_CHANNELS]{};
```

PrivatizedDecodeOpT is PassThruTransform in this context, which is a struct with no data members. It therefore occupies 1 byte of storage but is technically empty. Value initialization does not seem to touch any bytes, so the later memmove complains that it is transferring unwritten bytes.

I cannot suppress this warning, since it occurs at the locus of __dispatch_memmove, so I went for adding a dummy char data member to PassThruTransform. This way (I assume) value initializing does zero the memory before memmove touches it.
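
To make the reasoning above concrete, here is a minimal host-only C++ sketch. `EmptyTransform` and `PaddedTransform` are hypothetical stand-ins for `PassThruTransform` without and with the dummy member; they are not the CUB definitions, and the sketch does not reproduce the g++-14 diagnostic. It only illustrates that an empty struct still occupies storage, and that once a `char` member exists, value initialization has a byte to zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for an empty pass-through functor (no data members).
struct EmptyTransform
{
  template <typename T>
  T operator()(T value) const
  {
    return value;
  }
};

// Hypothetical stand-in for the workaround: the same functor plus a dummy byte.
struct PaddedTransform
{
  char dummy;

  template <typename T>
  T operator()(T value) const
  {
    return value;
  }
};

// An empty class has no members to initialize, yet its size is never zero.
static_assert(std::is_empty<EmptyTransform>::value, "no non-static data members");
static_assert(sizeof(EmptyTransform) >= 1, "objects always occupy storage");
static_assert(!std::is_empty<PaddedTransform>::value, "dummy member is real data");

// Returns true if every byte of a value-initialized PaddedTransform array is zero.
inline bool padded_array_is_zeroed()
{
  PaddedTransform ops[4]{}; // aggregate initialization zeroes each `dummy`
  unsigned char bytes[sizeof(ops)];
  std::memcpy(bytes, ops, sizeof(ops));
  for (std::size_t i = 0; i < sizeof(ops); ++i)
  {
    if (bytes[i] != 0)
    {
      return false;
    }
  }
  return true;
}
```

Calling `padded_array_is_zeroed()` inspects the bytes only of the padded variant, because for the empty variant there is no member whose value the language requires to be written.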

@bernhardmgruber bernhardmgruber marked this pull request as ready for review August 1, 2024 00:46
@bernhardmgruber bernhardmgruber requested review from a team as code owners August 1, 2024 00:46
@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label Aug 1, 2024
Contributor

github-actions bot commented Aug 1, 2024

🟨 CI finished in 2h 43m: Pass: 99%/250 | Total: 1d 09h | Avg: 8m 00s | Max: 33m 06s | Hits: 98%/249175
  • 🟨 cub: Pass: 99%/131 | Total: 21h 58m | Avg: 10m 03s | Max: 32m 46s | Hits: 98%/110263

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/123 | Total: 20h 58m | Avg: 10m 13s | Max: 32m 46s | Hits:  98%/103327
      🟩 arm64              Pass: 100%/8   | Total:  1h 00m | Avg:  7m 31s | Max:  8m 45s | Hits:  97%/6936  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 30m | Avg:  6m 03s | Max: 15m 22s | Hits:  98%/11792 
      🟩 11.8               Pass: 100%/3   | Total: 23m 32s | Avg:  7m 50s | Max:  8m 38s | Hits:  97%/2601  
      🔍 12.5               Pass:  99%/113 | Total: 20h 04m | Avg: 10m 39s | Max: 32m 46s | Hits:  98%/95870 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 15s | Hits:  98%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 30m | Avg:  6m 03s | Max: 15m 22s | Hits:  98%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 23m 32s | Avg:  7m 50s | Max:  8m 38s | Hits:  97%/2601  
      🔍 nvcc12.5           Pass:  99%/111 | Total: 19h 53m | Avg: 10m 45s | Max: 32m 46s | Hits:  98%/94434 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 15s | Hits:  98%/1436  
      🔍 nvcc               Pass:  99%/129 | Total: 21h 48m | Avg: 10m 08s | Max: 32m 46s | Hits:  98%/108827
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total: 35m 59s | Avg:  5m 59s | Max:  6m 48s | Hits:  98%/4980  
      🟩 Clang10            Pass: 100%/3   | Total: 20m 45s | Avg:  6m 55s | Max:  7m 27s | Hits:  98%/2607  
      🟩 Clang11            Pass: 100%/4   | Total: 24m 21s | Avg:  6m 05s | Max:  7m 16s | Hits:  98%/3476  
      🟩 Clang12            Pass: 100%/4   | Total: 24m 11s | Avg:  6m 02s | Max:  6m 36s | Hits:  98%/3476  
      🟩 Clang13            Pass: 100%/4   | Total: 23m 51s | Avg:  5m 57s | Max:  6m 34s | Hits:  98%/3476  
      🟩 Clang14            Pass: 100%/4   | Total: 24m 44s | Avg:  6m 11s | Max:  7m 09s | Hits:  98%/3476  
      🟩 Clang15            Pass: 100%/4   | Total: 25m 21s | Avg:  6m 20s | Max:  7m 15s | Hits:  98%/3468  
      🟩 Clang16            Pass: 100%/4   | Total: 24m 34s | Avg:  6m 08s | Max:  6m 58s | Hits:  98%/3468  
      🟩 Clang17            Pass: 100%/26  | Total:  6h 46m | Avg: 15m 37s | Max: 30m 57s | Hits:  99%/22244 
      🟩 GCC6               Pass: 100%/2   | Total: 11m 20s | Avg:  5m 40s | Max:  5m 43s | Hits:  98%/1582  
      🟩 GCC7               Pass: 100%/6   | Total: 34m 09s | Avg:  5m 41s | Max:  6m 23s | Hits:  97%/4983  
      🟩 GCC8               Pass: 100%/6   | Total: 34m 26s | Avg:  5m 44s | Max:  6m 46s | Hits:  97%/4983  
      🟩 GCC9               Pass: 100%/6   | Total: 34m 14s | Avg:  5m 42s | Max:  6m 34s | Hits:  97%/4983  
      🟩 GCC10              Pass: 100%/4   | Total: 25m 16s | Avg:  6m 19s | Max:  6m 51s | Hits:  97%/3476  
      🟩 GCC11              Pass: 100%/7   | Total: 47m 38s | Avg:  6m 48s | Max:  8m 38s | Hits:  97%/6069  
      🟩 GCC12              Pass: 100%/4   | Total: 24m 35s | Avg:  6m 08s | Max:  6m 42s | Hits:  97%/3468  
      🔍 GCC13              Pass:  96%/28  | Total:  6h 38m | Avg: 14m 13s | Max: 32m 46s | Hits:  98%/23409 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 22m 01s | Avg:  7m 20s | Max:  7m 58s | Hits:  98%/2385  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 15m 22s | Avg: 15m 22s | Max: 15m 22s | Hits:  97%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 24m 57s | Avg: 12m 28s | Max: 13m 03s | Hits:  97%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 36m 45s | Avg: 12m 15s | Max: 12m 30s | Hits:  97%/2127  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/59  | Total: 10h 09m | Avg: 10m 20s | Max: 30m 57s | Hits:  98%/50671 
      🔍 GCC                Pass:  98%/63  | Total: 10h 09m | Avg:  9m 40s | Max: 32m 46s | Hits:  98%/52953 
      🟩 Intel              Pass: 100%/3   | Total: 22m 01s | Avg:  7m 20s | Max:  7m 58s | Hits:  98%/2385  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 17m | Avg: 12m 50s | Max: 15m 22s | Hits:  97%/4254  
    🔍 jobs: DeviceLaunch 🔍
      🟩 Build              Pass: 100%/99  | Total: 10h 54m | Avg:  6m 36s | Max: 15m 22s | Hits:  97%/83386 
      🔍 DeviceLaunch       Pass:  87%/8   | Total:  2h 27m | Avg: 18m 29s | Max: 24m 54s | Hits:  99%/6069  
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 22m | Avg: 17m 52s | Max: 22m 33s | Hits:  99%/6936  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 38m | Avg: 19m 45s | Max: 24m 29s | Hits:  99%/6936  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 34m | Avg: 26m 51s | Max: 32m 46s | Hits:  99%/6936  
    🔍 std: 11 🔍
      🔍 11                 Pass:  97%/34  | Total:  5h 02m | Avg:  8m 53s | Max: 23m 02s | Hits:  98%/28182 
      🟩 14                 Pass: 100%/37  | Total:  6h 20m | Avg: 10m 17s | Max: 30m 57s | Hits:  98%/31176 
      🟩 17                 Pass: 100%/36  | Total:  5h 59m | Avg:  9m 59s | Max: 32m 46s | Hits:  98%/30394 
      🟩 20                 Pass: 100%/24  | Total:  4h 35m | Avg: 11m 28s | Max: 29m 04s | Hits:  98%/20511 
    🟨 gpu
      🟨 v100               Pass:  99%/131 | Total: 21h 58m | Avg: 10m 03s | Max: 32m 46s | Hits:  98%/110263
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 23m 32s | Avg:  7m 50s | Max:  8m 38s | Hits:  97%/2601  
      🟩 90a                Pass: 100%/4   | Total: 18m 01s | Avg:  4m 30s | Max:  4m 51s | Hits:  97%/3468  
    
  • 🟩 thrust: Pass: 100%/118 | Total: 11h 10m | Avg: 5m 40s | Max: 33m 06s | Hits: 99%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 10h 08m | Avg:  5m 32s | Max: 18m 53s | Hits:  99%/129492
      🟩 arm64              Pass: 100%/8   | Total:  1h 01m | Avg:  7m 41s | Max: 33m 06s | Hits:  90%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total: 57m 10s | Avg:  3m 48s | Max: 12m 31s | Hits:  99%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 11m 34s | Avg:  3m 51s | Max:  4m 02s | Hits:  99%/3534  
      🟩 12.5               Pass: 100%/100 | Total: 10h 01m | Avg:  6m 01s | Max: 33m 06s | Hits:  99%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 30s | Avg:  3m 45s | Max:  3m 52s | Hits: 100%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total: 57m 10s | Avg:  3m 48s | Max: 12m 31s | Hits:  99%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 11m 34s | Avg:  3m 51s | Max:  4m 02s | Hits:  99%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  9h 54m | Avg:  6m 03s | Max: 33m 06s | Hits:  99%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 30s | Avg:  3m 45s | Max:  3m 52s | Hits: 100%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 11h 02m | Avg:  5m 42s | Max: 33m 06s | Hits:  99%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 23m 10s | Avg:  3m 51s | Max:  4m 20s | Hits: 100%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 13m 01s | Avg:  4m 20s | Max:  4m 51s | Hits: 100%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 14m 46s | Avg:  3m 41s | Max:  3m 51s | Hits: 100%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 14m 47s | Avg:  3m 41s | Max:  4m 06s | Hits: 100%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 14m 58s | Avg:  3m 44s | Max:  3m 55s | Hits: 100%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 15m 04s | Avg:  3m 46s | Max:  3m 52s | Hits: 100%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 15m 23s | Avg:  3m 50s | Max:  4m 02s | Hits: 100%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 15m 07s | Avg:  3m 46s | Max:  3m 58s | Hits: 100%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  2h 02m | Avg:  6m 46s | Max: 18m 02s | Hits: 100%/21186 
      🟩 GCC6               Pass: 100%/2   | Total:  6m 00s | Avg:  3m 00s | Max:  3m 06s | Hits:  99%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 19m 17s | Avg:  3m 12s | Max:  3m 41s | Hits:  99%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 20m 06s | Avg:  3m 21s | Max:  3m 45s | Hits:  99%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 20m 34s | Avg:  3m 25s | Max:  3m 44s | Hits:  99%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 15m 23s | Avg:  3m 50s | Max:  4m 08s | Hits:  99%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 26m 17s | Avg:  3m 45s | Max:  4m 02s | Hits:  99%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 15m 11s | Avg:  3m 47s | Max:  4m 09s | Hits:  99%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  2h 43m | Avg:  8m 10s | Max: 33m 06s | Hits:  95%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 13m 24s | Avg:  4m 28s | Max:  4m 34s | Hits: 100%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 31s | Avg: 12m 31s | Max: 12m 31s | Hits:  98%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 22m 13s | Avg: 11m 06s | Max: 11m 08s | Hits:  98%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 27m | Avg: 14m 38s | Max: 18m 30s | Hits:  98%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  4h 08m | Avg:  4m 52s | Max: 18m 02s | Hits: 100%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  4h 46m | Avg:  5m 12s | Max: 33m 06s | Hits:  98%/64788 
      🟩 Intel              Pass: 100%/3   | Total: 13m 24s | Avg:  4m 28s | Max:  4m 34s | Hits: 100%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 02m | Avg: 13m 37s | Max: 18m 30s | Hits:  98%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 11h 10m | Avg:  5m 40s | Max: 33m 06s | Hits:  99%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 25m | Avg:  4m 29s | Max: 33m 06s | Hits:  99%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 43m | Avg:  9m 24s | Max: 18m 30s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 01m | Avg: 15m 12s | Max: 18m 53s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 11m 34s | Avg:  3m 51s | Max:  4m 02s | Hits:  99%/3534  
      🟩 90a                Pass: 100%/4   | Total: 13m 22s | Avg:  3m 20s | Max:  3m 28s | Hits:  99%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 15m | Avg:  4m 31s | Max: 18m 02s | Hits:  99%/35328 
      🟩 14                 Pass: 100%/34  | Total:  3h 08m | Avg:  5m 32s | Max: 17m 06s | Hits:  99%/40020 
      🟩 17                 Pass: 100%/33  | Total:  3h 03m | Avg:  5m 34s | Max: 18m 10s | Hits:  99%/38847 
      🟩 20                 Pass: 100%/21  | Total:  2h 42m | Avg:  7m 44s | Max: 33m 06s | Hits:  96%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

Comment on lines +848 to +852
```
// GCC 14 rightfully warns that when a value-initialized array of this struct is copied using memcpy, uninitialized
// bytes may be accessed. To avoid this, we add a dummy member, so value initialization actually initializes the memory.
#if defined(_CCCL_COMPILER_GCC) && __GNUC__ == 14
  char dummy;
#endif
```
Collaborator

@miscco miscco Aug 1, 2024

Question: Would it suffice to provide a user-defined default constructor that does nothing? That should technically zero out the struct.

Contributor Author

I can try!

Contributor Author

@bernhardmgruber bernhardmgruber Aug 1, 2024

It does not solve the issue and I get the same error message. That is, adding a constructor PassThruTransform() {} does not prevent the warning.
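
One possible reading of this outcome, offered as an assumption rather than a diagnosis of the g++-14 behaviour: a user-provided default constructor changes how the object is initialized, but it does not give an empty struct any member to write to. The sketch below uses hypothetical types (`DefaultedEmpty`, `UserCtorEmpty`), not the CUB code, and only checks standard type traits.

```cpp
#include <type_traits>

// Hypothetical empty functor with an implicit default constructor.
struct DefaultedEmpty
{};

// Hypothetical empty functor with a user-provided constructor that does nothing.
struct UserCtorEmpty
{
  UserCtorEmpty() {}
};

// Both types are still empty: neither has a data member to initialize.
static_assert(std::is_empty<DefaultedEmpty>::value, "still empty");
static_assert(std::is_empty<UserCtorEmpty>::value, "still empty");

// The user-provided constructor only makes default construction non-trivial.
static_assert(std::is_trivially_default_constructible<DefaultedEmpty>::value, "trivial");
static_assert(!std::is_trivially_default_constructible<UserCtorEmpty>::value, "non-trivial");

// True if adding the constructor changed the amount of storage per object.
constexpr bool ctor_changes_size()
{
  return sizeof(DefaultedEmpty) != sizeof(UserCtorEmpty);
}
```

In this sketch the constructor body is empty, so there is no statement that stores to the object's single byte of storage either way.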

@bernhardmgruber
Contributor Author

I checked the code generated by nvcc --keep, but there is nothing unusual in it. Also, I failed to reproduce the issue with g++-14 on Compiler Explorer :S It seems like a g++-14 bug that only occurs in our case, since the error is produced by nvcc when it invokes g++-14 as the host compiler on catch2_test_device_histogram.cudafe1.cpp.

Contributor

github-actions bot commented Aug 1, 2024

🟩 CI finished in 12h 31m: Pass: 100%/250 | Total: 1d 09h | Avg: 8m 02s | Max: 33m 06s | Hits: 98%/250042
  • 🟩 cub: Pass: 100%/131 | Total: 22h 10m | Avg: 10m 09s | Max: 32m 46s | Hits: 98%/111130

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total: 21h 09m | Avg: 10m 19s | Max: 32m 46s | Hits:  98%/104194
      🟩 arm64              Pass: 100%/8   | Total:  1h 00m | Avg:  7m 31s | Max:  8m 45s | Hits:  97%/6936  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 30m | Avg:  6m 03s | Max: 15m 22s | Hits:  98%/11792 
      🟩 11.8               Pass: 100%/3   | Total: 23m 32s | Avg:  7m 50s | Max:  8m 38s | Hits:  97%/2601  
      🟩 12.5               Pass: 100%/113 | Total: 20h 15m | Avg: 10m 45s | Max: 32m 46s | Hits:  98%/96737 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 15s | Hits:  98%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 30m | Avg:  6m 03s | Max: 15m 22s | Hits:  98%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 23m 32s | Avg:  7m 50s | Max:  8m 38s | Hits:  97%/2601  
      🟩 nvcc12.5           Pass: 100%/111 | Total: 20h 05m | Avg: 10m 51s | Max: 32m 46s | Hits:  98%/95301 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 15s | Hits:  98%/1436  
      🟩 nvcc               Pass: 100%/129 | Total: 21h 59m | Avg: 10m 13s | Max: 32m 46s | Hits:  98%/109694
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 35m 59s | Avg:  5m 59s | Max:  6m 48s | Hits:  98%/4980  
      🟩 Clang10            Pass: 100%/3   | Total: 20m 45s | Avg:  6m 55s | Max:  7m 27s | Hits:  98%/2607  
      🟩 Clang11            Pass: 100%/4   | Total: 24m 21s | Avg:  6m 05s | Max:  7m 16s | Hits:  98%/3476  
      🟩 Clang12            Pass: 100%/4   | Total: 24m 11s | Avg:  6m 02s | Max:  6m 36s | Hits:  98%/3476  
      🟩 Clang13            Pass: 100%/4   | Total: 23m 51s | Avg:  5m 57s | Max:  6m 34s | Hits:  98%/3476  
      🟩 Clang14            Pass: 100%/4   | Total: 24m 44s | Avg:  6m 11s | Max:  7m 09s | Hits:  98%/3476  
      🟩 Clang15            Pass: 100%/4   | Total: 25m 21s | Avg:  6m 20s | Max:  7m 15s | Hits:  98%/3468  
      🟩 Clang16            Pass: 100%/4   | Total: 24m 34s | Avg:  6m 08s | Max:  6m 58s | Hits:  98%/3468  
      🟩 Clang17            Pass: 100%/26  | Total:  6h 46m | Avg: 15m 37s | Max: 30m 57s | Hits:  99%/22244 
      🟩 GCC6               Pass: 100%/2   | Total: 11m 20s | Avg:  5m 40s | Max:  5m 43s | Hits:  98%/1582  
      🟩 GCC7               Pass: 100%/6   | Total: 34m 09s | Avg:  5m 41s | Max:  6m 23s | Hits:  97%/4983  
      🟩 GCC8               Pass: 100%/6   | Total: 34m 26s | Avg:  5m 44s | Max:  6m 46s | Hits:  97%/4983  
      🟩 GCC9               Pass: 100%/6   | Total: 34m 14s | Avg:  5m 42s | Max:  6m 34s | Hits:  97%/4983  
      🟩 GCC10              Pass: 100%/4   | Total: 25m 16s | Avg:  6m 19s | Max:  6m 51s | Hits:  97%/3476  
      🟩 GCC11              Pass: 100%/7   | Total: 47m 38s | Avg:  6m 48s | Max:  8m 38s | Hits:  97%/6069  
      🟩 GCC12              Pass: 100%/4   | Total: 24m 35s | Avg:  6m 08s | Max:  6m 42s | Hits:  97%/3468  
      🟩 GCC13              Pass: 100%/28  | Total:  6h 49m | Avg: 14m 37s | Max: 32m 46s | Hits:  98%/24276 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 22m 01s | Avg:  7m 20s | Max:  7m 58s | Hits:  98%/2385  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 15m 22s | Avg: 15m 22s | Max: 15m 22s | Hits:  97%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 24m 57s | Avg: 12m 28s | Max: 13m 03s | Hits:  97%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 36m 45s | Avg: 12m 15s | Max: 12m 30s | Hits:  97%/2127  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total: 10h 09m | Avg: 10m 20s | Max: 30m 57s | Hits:  98%/50671 
      🟩 GCC                Pass: 100%/63  | Total: 10h 21m | Avg:  9m 51s | Max: 32m 46s | Hits:  98%/53820 
      🟩 Intel              Pass: 100%/3   | Total: 22m 01s | Avg:  7m 20s | Max:  7m 58s | Hits:  98%/2385  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 17m | Avg: 12m 50s | Max: 15m 22s | Hits:  97%/4254  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total: 22h 10m | Avg: 10m 09s | Max: 32m 46s | Hits:  98%/111130
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total: 10h 54m | Avg:  6m 36s | Max: 15m 22s | Hits:  97%/83386 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 39m | Avg: 19m 55s | Max: 24m 54s | Hits:  99%/6936  
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 22m | Avg: 17m 52s | Max: 22m 33s | Hits:  99%/6936  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 38m | Avg: 19m 45s | Max: 24m 29s | Hits:  99%/6936  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 34m | Avg: 26m 51s | Max: 32m 46s | Hits:  99%/6936  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 23m 32s | Avg:  7m 50s | Max:  8m 38s | Hits:  97%/2601  
      🟩 90a                Pass: 100%/4   | Total: 18m 01s | Avg:  4m 30s | Max:  4m 51s | Hits:  97%/3468  
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  5h 13m | Avg:  9m 14s | Max: 23m 02s | Hits:  98%/29049 
      🟩 14                 Pass: 100%/37  | Total:  6h 20m | Avg: 10m 17s | Max: 30m 57s | Hits:  98%/31176 
      🟩 17                 Pass: 100%/36  | Total:  5h 59m | Avg:  9m 59s | Max: 32m 46s | Hits:  98%/30394 
      🟩 20                 Pass: 100%/24  | Total:  4h 35m | Avg: 11m 28s | Max: 29m 04s | Hits:  98%/20511 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 11h 10m | Avg: 5m 40s | Max: 33m 06s | Hits: 99%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 10h 08m | Avg:  5m 32s | Max: 18m 53s | Hits:  99%/129492
      🟩 arm64              Pass: 100%/8   | Total:  1h 01m | Avg:  7m 41s | Max: 33m 06s | Hits:  90%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total: 57m 10s | Avg:  3m 48s | Max: 12m 31s | Hits:  99%/17660 
      🟩 11.8               Pass: 100%/3   | Total: 11m 34s | Avg:  3m 51s | Max:  4m 02s | Hits:  99%/3534  
      🟩 12.5               Pass: 100%/100 | Total: 10h 01m | Avg:  6m 01s | Max: 33m 06s | Hits:  99%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 30s | Avg:  3m 45s | Max:  3m 52s | Hits: 100%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total: 57m 10s | Avg:  3m 48s | Max: 12m 31s | Hits:  99%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 11m 34s | Avg:  3m 51s | Max:  4m 02s | Hits:  99%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  9h 54m | Avg:  6m 03s | Max: 33m 06s | Hits:  99%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 30s | Avg:  3m 45s | Max:  3m 52s | Hits: 100%/2354  
      🟩 nvcc               Pass: 100%/116 | Total: 11h 02m | Avg:  5m 42s | Max: 33m 06s | Hits:  99%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 23m 10s | Avg:  3m 51s | Max:  4m 20s | Hits: 100%/7062  
      🟩 Clang10            Pass: 100%/3   | Total: 13m 01s | Avg:  4m 20s | Max:  4m 51s | Hits: 100%/3531  
      🟩 Clang11            Pass: 100%/4   | Total: 14m 46s | Avg:  3m 41s | Max:  3m 51s | Hits: 100%/4708  
      🟩 Clang12            Pass: 100%/4   | Total: 14m 47s | Avg:  3m 41s | Max:  4m 06s | Hits: 100%/4708  
      🟩 Clang13            Pass: 100%/4   | Total: 14m 58s | Avg:  3m 44s | Max:  3m 55s | Hits: 100%/4708  
      🟩 Clang14            Pass: 100%/4   | Total: 15m 04s | Avg:  3m 46s | Max:  3m 52s | Hits: 100%/4708  
      🟩 Clang15            Pass: 100%/4   | Total: 15m 23s | Avg:  3m 50s | Max:  4m 02s | Hits: 100%/4708  
      🟩 Clang16            Pass: 100%/4   | Total: 15m 07s | Avg:  3m 46s | Max:  3m 58s | Hits: 100%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  2h 02m | Avg:  6m 46s | Max: 18m 02s | Hits: 100%/21186 
      🟩 GCC6               Pass: 100%/2   | Total:  6m 00s | Avg:  3m 00s | Max:  3m 06s | Hits:  99%/2354  
      🟩 GCC7               Pass: 100%/6   | Total: 19m 17s | Avg:  3m 12s | Max:  3m 41s | Hits:  99%/7068  
      🟩 GCC8               Pass: 100%/6   | Total: 20m 06s | Avg:  3m 21s | Max:  3m 45s | Hits:  99%/7068  
      🟩 GCC9               Pass: 100%/6   | Total: 20m 34s | Avg:  3m 25s | Max:  3m 44s | Hits:  99%/7068  
      🟩 GCC10              Pass: 100%/4   | Total: 15m 23s | Avg:  3m 50s | Max:  4m 08s | Hits:  99%/4712  
      🟩 GCC11              Pass: 100%/7   | Total: 26m 17s | Avg:  3m 45s | Max:  4m 02s | Hits:  99%/8246  
      🟩 GCC12              Pass: 100%/4   | Total: 15m 11s | Avg:  3m 47s | Max:  4m 09s | Hits:  99%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  2h 43m | Avg:  8m 10s | Max: 33m 06s | Hits:  95%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 13m 24s | Avg:  4m 28s | Max:  4m 34s | Hits: 100%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 31s | Avg: 12m 31s | Max: 12m 31s | Hits:  98%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 22m 13s | Avg: 11m 06s | Max: 11m 08s | Hits:  98%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 27m | Avg: 14m 38s | Max: 18m 30s | Hits:  98%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  4h 08m | Avg:  4m 52s | Max: 18m 02s | Hits: 100%/60027 
      🟩 GCC                Pass: 100%/55  | Total:  4h 46m | Avg:  5m 12s | Max: 33m 06s | Hits:  98%/64788 
      🟩 Intel              Pass: 100%/3   | Total: 13m 24s | Avg:  4m 28s | Max:  4m 34s | Hits: 100%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 02m | Avg: 13m 37s | Max: 18m 30s | Hits:  98%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 11h 10m | Avg:  5m 40s | Max: 33m 06s | Hits:  99%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 25m | Avg:  4m 29s | Max: 33m 06s | Hits:  99%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 43m | Avg:  9m 24s | Max: 18m 30s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 01m | Avg: 15m 12s | Max: 18m 53s | Hits:  99%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 11m 34s | Avg:  3m 51s | Max:  4m 02s | Hits:  99%/3534  
      🟩 90a                Pass: 100%/4   | Total: 13m 22s | Avg:  3m 20s | Max:  3m 28s | Hits:  99%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 15m | Avg:  4m 31s | Max: 18m 02s | Hits:  99%/35328 
      🟩 14                 Pass: 100%/34  | Total:  3h 08m | Avg:  5m 32s | Max: 17m 06s | Hits:  99%/40020 
      🟩 17                 Pass: 100%/33  | Total:  3h 03m | Avg:  5m 34s | Max: 18m 10s | Hits:  99%/38847 
      🟩 20                 Pass: 100%/21  | Total:  2h 42m | Avg:  7m 44s | Max: 33m 06s | Hits:  96%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 27s | Avg: 11m 27s | Max: 11m 27s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@@ -844,6 +845,12 @@ public:
// Pass-through bin transform operator
struct PassThruTransform
{
// GCC 14 rightfully warns that when a value-initialized array of this struct is copied using memcpy, uninitialized
// bytes may be accessed. To avoid this, we add a dummy member, so value initialization actually initializes the memory.
#if defined(_CCCL_COMPILER_GCC) && __GNUC__ == 14
Collaborator

This should probably be >= rather than ==.

Contributor Author

It could. But I would like to run into this issue again with GCC 15, since there is a chance they fix their bug :)
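
The trade-off in this exchange can be sketched with two small predicates. `gate_exact` and `gate_at_least` are hypothetical helpers that mirror the conditions `__GNUC__ == 14` and `__GNUC__ >= 14`; they are not part of CCCL and take the major version as a plain argument so the behaviour can be checked on any compiler.

```cpp
// Hypothetical predicate mirroring `__GNUC__ == 14`.
constexpr bool gate_exact(int gcc_major)
{
  return gcc_major == 14;
}

// Hypothetical predicate mirroring `__GNUC__ >= 14`.
constexpr bool gate_at_least(int gcc_major)
{
  return gcc_major >= 14;
}

// Both gates enable the workaround on GCC 14 and leave older compilers alone.
static_assert(gate_exact(14) && gate_at_least(14), "enabled on GCC 14");
static_assert(!gate_exact(13) && !gate_at_least(13), "disabled before GCC 14");

// On GCC 15 the exact gate turns the workaround off, so the warning would
// resurface if the compiler still emits it; the open-ended gate keeps it on.
static_assert(!gate_exact(15), "exact gate expires after GCC 14");
static_assert(gate_at_least(15), "open-ended gate persists");
```

With the exact gate, a future compiler upgrade acts as a reminder to re-check whether the workaround is still needed, which is the motivation given in the reply above.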

@@ -414,6 +414,7 @@ struct dispatch_histogram
d_output_histograms, d_output_histograms + NUM_ACTIVE_CHANNELS, d_output_histograms_wrapper.begin());
::cuda::std::copy(
typedAllocations, typedAllocations + NUM_ACTIVE_CHANNELS, d_privatized_histograms_wrapper.begin());
// TODO(bgruber): we can probably skip copying the function objects when they are empty
Collaborator

Yeah, this would be a nicer fix, but as a hotfix what you have here is fine.
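
A minimal sketch of what the TODO might look like, assuming plain host C++ and tag dispatch on `std::is_empty`. The names `copy_ops`, `copy_ops_impl`, `EmptyOp`, and `ScaleOp` are hypothetical and use `std::copy` in place of `::cuda::std::copy`; this is not the change that was merged, only one way the idea could be expressed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical function objects: one stateless, one carrying data.
struct EmptyOp
{};

struct ScaleOp
{
  int factor;
};

// Empty function objects carry no state, so the copy is skipped entirely.
template <typename Op>
bool copy_ops_impl(const Op*, const Op*, Op*, std::true_type /*is_empty*/)
{
  return false;
}

// Stateful function objects are copied element by element.
template <typename Op>
bool copy_ops_impl(const Op* first, const Op* last, Op* out, std::false_type /*is_empty*/)
{
  std::copy(first, last, out);
  return true;
}

// Dispatches at compile time; returns true if a copy was actually performed.
template <typename Op>
bool copy_ops(const Op* first, const Op* last, Op* out)
{
  return copy_ops_impl(first, last, out, std::integral_constant<bool, std::is_empty<Op>::value>{});
}
```

Because the choice is made through overload resolution, the copy routine is never instantiated for an empty type, so no byte of such an object is read.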

@bernhardmgruber bernhardmgruber merged commit 02378eb into NVIDIA:main Aug 1, 2024
269 checks passed
@bernhardmgruber bernhardmgruber deleted the fix_gcc14_warn branch August 1, 2024 17:02
pciolkosz pushed a commit to pciolkosz/cccl that referenced this pull request Aug 4, 2024
```
In function bool cuda::std::__4::__dispatch_memmove(_Up*, _Tp*, size_t)
...
error: *(unsigned char*)(&privatized_decode_op[0]) may be used uninitialized [-Werror=maybe-uninitialized]
...
*(unsigned char*)(&privatized_decode_op[0]) was declared here
 1528 |       PrivatizedDecodeOpT privatized_decode_op[NUM_ACTIVE_CHANNELS]{};
```
Labels
cub For all items related to CUB
3 participants