Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: Compilation error when both NVTX v2 and CUB are included #2300

Open
1 task done
dkolsen-pgi opened this issue Aug 27, 2024 · 3 comments
Open
1 task done

[BUG]: Compilation error when both NVTX v2 and CUB are included #2300

dkolsen-pgi opened this issue Aug 27, 2024 · 3 comments
Assignees
Labels
bug Something isn't working right.

Comments

@dkolsen-pgi
Copy link
Collaborator

Is this a duplicate?

Type of Bug

Compile-time Error

Component

CUB

Describe the bug

As of CUDA 12.6, CUB includes NVTX version 3 headers. If a source file includes an earlier version of NVTX and then includes a CUB header, then the NVTX version 3 header will #error out with "Trying to #include NVTX version 3 in a source file where an older NVTX version has already been included." The CUDA Toolkit contains headers for NVTX version 2, so it is quite easy for users to hit this compilation error. (See the steps to reproduce for a trivial example.)

How to Reproduce

$ cat nvtx.cu
#include <nvToolsExt.h>
#include <thrust/reduce.h>
int main() { }
$ /proj/cuda/12.6/Linux_x86_64/bin/nvcc nvtx.cu
In file included from /proj/cuda/12.6.0.560/Linux_x86_64/targets/x86_64-linux/include/cub/detail/nvtx3.hpp:635,
                 from /proj/cuda/12.6/Linux_x86_64/bin/../targets/x86_64-linux/include/cub/detail/nvtx.cuh:49,
                 from /proj/cuda/12.6/Linux_x86_64/bin/../targets/x86_64-linux/include/cub/device/device_reduce.cuh:46,
                 from /proj/cuda/12.6/Linux_x86_64/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/reduce.h:43,
                 from /proj/cuda/12.6/Linux_x86_64/bin/../targets/x86_64-linux/include/thrust/system/detail/adl/reduce.h:50,
                 from /proj/cuda/12.6/Linux_x86_64/bin/../targets/x86_64-linux/include/thrust/detail/reduce.inl:31,
                 from /proj/cuda/12.6/Linux_x86_64/bin/../targets/x86_64-linux/include/thrust/reduce.h:763,
                 from nvtx.cu:2:
/proj/cuda/12.6/Linux_x86_64/bin/../targets/x86_64-linux/include/nvtx3/nvToolsExt.h:179:2: error: #error "Trying to #include NVTX version 3 in a source file where an older NVTX version has already been included.  If you are not directly using NVTX (the NVIDIA Tools Extension library), you are getting this error because libraries you are using have included different versions of NVTX.  Suggested solutions are: (1) reorder #includes so the newest NVTX version is included first, (2) avoid using the conflicting libraries in the same .c/.cpp file, or (3) update the library using the older NVTX version to use the newer version instead."
  179 | #error "Trying to #include NVTX version 3 in a source file where an older NVTX version has already been included.  If you are not directly using NVTX (the NVIDIA Tools Extension library), you are getting this error because libraries you are using have included different versions of NVTX.  Suggested solutions are: (1) reorder #includes so the newest NVTX version is included first, (2) avoid using the conflicting libraries in the same .c/.cpp file, or (3) update the library using the older NVTX version to use the newer version instead."
      |  ^~~~~

Expected behavior

CUB should not fail with a compilation error if <nvToolsExt.h> has been included before CUB is included.

Reproduction link

No response

Operating System

No response

nvidia-smi output

No response

NVCC version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:34:21_PDT_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0

@dkolsen-pgi dkolsen-pgi added the bug Something isn't working right. label Aug 27, 2024
@bernhardmgruber bernhardmgruber self-assigned this Aug 27, 2024
@jrhemstad
Copy link
Collaborator

I don't really see what we can do above and beyond what we already provide. The nvToolsExt.h header is deprecated and slated to be removed from the CTK.

A person's options are:

  • Don't include nvToolsExt.h
  • Disable CCCL NVTX ranges with -DCCCL_DISABLE_NVTX

@dkolsen-pgi dkolsen-pgi changed the title [BUG]: [BUG]: Compilation error when both NVTX v2 and CUB are included Aug 28, 2024
@dkolsen-pgi
Copy link
Collaborator Author

CUB can deal with this by adding something like this to the CUB header:

#if defined(NVTX_VERSION) && NVTX_VERSION < 3
#define CCCL_DISABLE_NVTX 1
#endif

@bernhardmgruber
Copy link
Contributor

The situation is actually more complex. While we can workaround your example by disabling NVTX3 inside CUB:

#include <nvToolsExt.h>
// NVTX_VERSION is defined to 2
#include <thrust/reduce.h> // CUB sees NVTX_VERSION == 2 and disables NVTX3
int main() { }

We cannot workaround a swapped order of includes:

#include <thrust/reduce.h> // CUB sees no NVTX_VERSION and includes NVTX3
#include <nvToolsExt.h> // clashes with NVTX3 symbols
int main() { }

See: https://cuda.godbolt.org/z/PTq64P4dr

Given we can only provide half a workaround, I would rather not have it at all. The final order of includes can be unpredictable sometimes, so users would only be surprised that they can use NVTX2 in some TUs and not in others. Or a refactoring of their headers would suddently break, because NVTX2 and NVTX3 include order changed. That sounds very brittle. I think users should rather take note of mixing APIs deterministically and start to migrate away from NVTX2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working right.
Projects
Status: Blocked
Development

No branches or pull requests

3 participants