jrhemstad changed the title from "[FEA]: Improve usability of architecture specific features in libcudacxx" to "[EPIC]: Improve usability of architecture specific features in libcudacxx" on Jul 22, 2024
Is this a duplicate?
Area
libcu++
Is your feature request related to a problem? Please describe.
As a CUDA developer using libcu++, I want to be able to use architecture dependent features of libcudacxx in my CUDA application. For any given libcudacxx header and feature, I need to be able to do the following:
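A minimal sketch of the kind of file meant here, assuming `cuda/atomic` as the architecture dependent header and `cuda::atomic_ref` as the feature. The kernel name `count_threads` is hypothetical and only illustrates the guarded-use pattern; it is not taken from libcu++ or from this issue.

```cuda
#include <cuda/atomic>  // architecture dependent feature (sm_60+ per this issue)
#include <nv/target>    // NV_IF_TARGET and the NV_PROVIDES_SM_XX conditions

// Hypothetical kernel: every thread increments a counter in global memory.
__global__ void count_threads(int* counter) {
  // The body is only emitted in device passes that provide sm_60 or newer,
  // so older targets (for example sm_52) never see the libcu++ atomic.
  NV_IF_TARGET(NV_PROVIDES_SM_60, (
    cuda::atomic_ref<int, cuda::thread_scope_device> ref(*counter);
    ref.fetch_add(1, cuda::std::memory_order_relaxed);
  ))
}
```

Built for a single supported target such as `-gencode arch=compute_70,code=sm_70`, a file like this is expected to compile; the request is that it also compile when an older target is added to the same command line.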
I need to be able to compile this file with any set of architectures (-gencode arch=compute_XX,code=sm_XX) and for it to be able to compile and link successfully so long as I am always careful to use an architecture dependent feature in an appropriately guarded code path, whether using NV_IF_TARGET or __CUDA_ARCH__.

However, this does not work universally today. For example, the following fails to compile when compiled with -gencode arch=compute_52,code=sm_52 -gencode arch=compute_70,code=sm_70:
https://godbolt.org/z/ddMaW65Ej
This is because the cuda/atomic header will unconditionally error any time it is included in a TU that compiles for an architecture less than sm60, even if the feature is never used in code paths for the unsupported architecture.

A similar problem exists with cuda/barrier: https://godbolt.org/z/aEjsMT5YK

Describe the solution you'd like
I should be able to do the following with all libcu++ headers and features:
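As a rough illustration of the requested behaviour, the sketch below guards the libcu++ feature with `__CUDA_ARCH__` and falls back to `atomicAdd()` on older targets. The kernel name `count_threads`, the launch configuration, and the choice of `cuda::atomic_ref` are assumptions made for the example. Per this issue, the unconditional `#include <cuda/atomic>` is what currently breaks the build when an architecture below sm_60 is among the targets, so today this is only expected to build when every target is sm_60 or newer.

```cuda
#include <cstdio>
#include <cuda/atomic>  // included unconditionally, used only in a guarded path

// Hypothetical kernel: every thread increments a counter in global memory.
__global__ void count_threads(int* counter) {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 600)
  // Device pass for sm_60 and newer: use the libcu++ atomic.
  cuda::atomic_ref<int, cuda::thread_scope_device> ref(*counter);
  ref.fetch_add(1, cuda::std::memory_order_relaxed);
#elif defined(__CUDA_ARCH__)
  // Device pass for older architectures: lower-level fallback.
  atomicAdd(counter, 1);
#endif
  // Host pass: __CUDA_ARCH__ is undefined, so the body is empty there.
}

int main() {
  int* counter = nullptr;
  cudaMalloc(&counter, sizeof(int));
  cudaMemset(counter, 0, sizeof(int));

  count_threads<<<4, 64>>>(counter);

  // cudaMemcpy on the default stream waits for the kernel to finish.
  int result = 0;
  cudaMemcpy(&result, counter, sizeof(int), cudaMemcpyDeviceToHost);
  std::printf("threads counted: %d\n", result);

  cudaFree(counter);
  return 0;
}
```

Error checking on the CUDA runtime calls is left out to keep the sketch short. The point of the pattern is that the choice between the libcu++ feature and the fallback is made per device pass, while the include itself stays unconditional.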
Tasks
Describe alternatives you've considered
If libcu++ doesn't do this, then I am forced to use lower level things like atomicAdd() or inline PTX.

Additional context
Related issues:
#997
#1082
#624