Implement cudax::demangle
#4996
If we added this to libcu++ directly (maybe as an internal function) we could kill template <typename T>
inline const std::string type_name = demangle(typeid(T).name()); |
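A minimal host-only sketch of the `type_name` idea from the comment above. It assumes a GCC or Clang toolchain and uses `abi::__cxa_demangle` from `<cxxabi.h>` as a stand-in for the `demangle` function proposed in this PR; the helper below is illustrative and is not the PR's implementation.

```cpp
#include <cxxabi.h> // abi::__cxa_demangle (GCC/Clang only)

#include <cstdlib> // std::free
#include <memory>
#include <string>
#include <typeinfo>

// Stand-in for the PR's demangle(); assumption: Itanium ABI host compiler.
inline std::string demangle(const char* mangled)
{
  int status = 0;
  // __cxa_demangle returns a malloc'ed buffer that the caller must free.
  std::unique_ptr<char, decltype(&std::free)> buffer(
    abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
  // Fall back to the mangled name if demangling failed.
  return (status == 0 && buffer) ? std::string(buffer.get()) : std::string(mangled);
}

// One demangled string per type, computed once during static initialization.
template <typename T>
inline const std::string type_name = demangle(typeid(T).name());
```

Because `typeid` ignores references and top-level cv-qualifiers, `type_name<const int&>` yields the same string as `type_name<int>`. Inline variable templates require C++17.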
Hey, why not! I'm just a bit unsure whether we want to use However, if we keep it internal, there is no such problem :) I would just change the implementation to get rid of the We may also add the library link to the cccl cmake module file, so that users building with cmake don't need to do it themselves |
|
/ok to test a8edcd2 |
🟨 CI finished in 9m 52s: Pass: 13%/30 | Total: 2h 02m | Avg: 4m 05s | Max: 8m 07s
|
@wmaxey the devcontainers are missing the Update: I've made a PR updating the devcontainers rapidsai/devcontainers#530 |
|
Are we adding a dependency to cccl that may be problematic for some users? The issue with the devcontainers gives me pause. |
|
I see cu++filt in the image:
$ docker run --rm rapidsai/devcontainers:25.08-cpp-gcc13-cuda12.9 apt-cache policy cuda-cuxxfilt-12-9
cuda-cuxxfilt-12-9:
Installed: 12.9.19-1
Candidate: 12.9.19-1
Version table:
*** 12.9.19-1 100
100 /var/lib/dpkg/status
$ docker run --rm rapidsai/devcontainers:25.08-cpp-gcc13-cuda12.9 dpkg -L cuda-cuxxfilt-12-9
/.
/usr
/usr/local
/usr/local/cuda-12.9
/usr/local/cuda-12.9/bin
/usr/local/cuda-12.9/bin/cu++filt
/usr/local/cuda-12.9/targets
/usr/local/cuda-12.9/targets/x86_64-linux
/usr/local/cuda-12.9/targets/x86_64-linux/include
/usr/local/cuda-12.9/targets/x86_64-linux/include/nv_decode.h
/usr/local/cuda-12.9/targets/x86_64-linux/lib
/usr/local/cuda-12.9/targets/x86_64-linux/lib/libcufilt.a
/usr/share
/usr/share/doc
/usr/share/doc/cuda-cuxxfilt-12-9
/usr/share/doc/cuda-cuxxfilt-12-9/changelog.Debian.gz
/usr/share/doc/cuda-cuxxfilt-12-9/copyright
/usr/local/cuda-12.9/include
/usr/local/cuda-12.9/lib64 |
That is definitely a valid point. But on the other hand it will not cause problems unless the As @trxcllnt pointed out, the The idea to provide this function came to mind after reading #4849. If implemented, it will bring other dependencies to CCCL, I believe. The motivation is to provide e.g. I understand the concerns and would like to know others' opinions :) Edit: I don't like that there is no |
|
Have you looked at using |
|
/ok to test 553fb6b |
🟨 CI finished in 9m 34s: Pass: 13%/30 | Total: 1h 52m | Avg: 3m 45s | Max: 7m 44s
Thanks for the suggestion! Actually, I have. The idea is to provide a simple The A preview version of a new tool, cu++filt, is included in this release. NVCC produces mangled names, appearing in PTX files, which do not strictly follow the mangling conventions of the Itanium ABI--and are thus not properly demangled by standard tools such as binutils' c++filt. Specifically, this is true for PTX function parameters. The new cu++filt utility will demangle all of these correctly. As this is a preview version of the utility, feedback is invited. For more information, see cu++filt. I am not sure whether it is still the case, but I would much prefer to use |
|
/ok to test 1fb0c8b |
🟨 CI finished in 2h 25m: Pass: 96%/183 | Total: 4d 00h | Avg: 31m 38s | Max: 1h 36m | Hits: 64%/277261
|
/ok to test dc3e643 |
|
I'd like to see us add a failure mode test that emulates an environment where |
CMake approved pending small addition.
Added :) |
  message(FATAL_ERROR "cudax: cu++filt library (libcufilt.a) not found.")
endif()

foreach(cn_target IN LISTS cudax_TARGETS)
Minor: replace cn_target with cudax_target in this new code. I'm making this change for existing code in #6346 -- cudax used to be called "cuda next" and this prefix is a legacy artifact that no longer makes sense.
set(test_target ${config_prefix}.test.binutils_demangle_no_nvdecode_fail)
add_test(NAME ${test_target}
  COMMAND ${CMAKE_CTEST_COMMAND}
  --build-and-test
--build-and-test isn't what we want here, it's used for configuring/building a project, not an individual translation unit.
Check out what we do for CUB "fail" tests:
https://github.com/NVIDIA/cccl/blob/main/cub/test/CMakeLists.txt#L174-L199
Basically:
- Add the test executable as normal.
- Exclude from the `all` build target:
set_target_properties(${test_target} PROPERTIES EXCLUDE_FROM_ALL true
EXCLUDE_FROM_DEFAULT_BUILD true)
- Add a test that explicitly invokes the build step for the excluded target:
add_test(NAME ${test_target}
COMMAND ${CMAKE_COMMAND} --build "${CMAKE_BINARY_DIR}"
--target ${test_target}
--config $<CONFIGURATION>)
- Keep the `WILL_FAIL` property on the test:
set_tests_properties(${test_target} PROPERTIES WILL_FAIL true)
...or for more robustness, instead check for output that confirms the expected failure mode, if a cross-platform regex exists for this:
set_tests_properties(${test_target} PROPERTIES PASS_REGULAR_EXPRESSION "<cross-platform error regex>")
(Remove the WILL_FAIL prop if the regex approach is used)
For example, the regex approach is preferred because the test is currently passing without actually testing what it's supposed to:
Start 19: cudax.cpp17.test.binutils_demangle_no_nvdecode_fail
19: Test command: /usr/bin/ctest "--build-and-test" "/home/coder/cccl/cudax/test/binutils/binutils_demangle_no_nvdecode_fail" "/home/coder/cccl/build/preset-latest/cudax/test/binutils/binutils_demangle_no_nvdecode_fail" "--build-generator" "Ninja" "--test-command" "/usr/bin/ctest"
19: Working Directory: /home/coder/cccl/build/preset-latest/cudax/test/binutils
19: Test timeout computed to be: 1500
19: Internal cmake changing into directory: /home/coder/cccl/build/preset-latest/cudax/test/binutils/binutils_demangle_no_nvdecode_fail
19: ======== CMake output ======
19: CMake Error: The source directory "/home/coder/cccl/cudax/test/binutils/binutils_demangle_no_nvdecode_fail" does not exist.
19: Specify --help for usage, or press the help button on the CMake GUI.
19: ======== End CMake output ======
19: Error: cmake execution failed
1/1 Test #19: cudax.cpp17.test.binutils_demangle_no_nvdecode_fail ... Passed 0.01 sec
The following tests passed:
cudax.cpp17.test.binutils_demangle_no_nvdecode_fail
See #6434. Repeat the steps for creating the executable target like you did for (2), but pass it to cccl_add_xfail_compile_target_test(${target_name} [REGEX <regex>]) instead of doing add_test / set_tests_properties.
🥳 CI Workflow Results
🟩 Finished in 37m 40s: Pass: 100%/42 | Total: 2h 11m | Max: 11m 00s | Hits: 99%/21477
This PR introduces a demangle function to cuda::experimental. It uses the __cu_demangle function from the libcufilt.a library, which the user is required to link when using this function.

This PR introduces the cuda::demangle function for demangling CUDA C++ symbols. It is available through the <cuda/__binutils_> module, which is currently internal. The demangling is implemented by the __cu_demangle function from the libcufilt.a library, which has shipped with the CUDA Toolkit since 11.2.
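For illustration, a wrapper around `__cu_demangle` might look roughly like the sketch below. It is not the code from this PR: the helper name `demangle_symbol` and the opt-in macro `DEMO_USE_CUFILT` are hypothetical. Without the macro the sketch falls back to `abi::__cxa_demangle`, which takes the same four arguments, so it builds on a host without the CUDA Toolkit. With the macro defined, `nv_decode.h` must be on the include path and `libcufilt.a` must be linked.

```cpp
#include <cstddef>
#include <cstdlib> // std::free
#include <memory>
#include <stdexcept>
#include <string>

#if defined(DEMO_USE_CUFILT) // hypothetical opt-in macro for this sketch
#  include <nv_decode.h> // __cu_demangle, shipped with the CUDA Toolkit
#else
#  include <cxxabi.h> // abi::__cxa_demangle, host-only stand-in
#endif

// Hypothetical helper: demangles a symbol name or throws on failure.
inline std::string demangle_symbol(const std::string& mangled)
{
  int status         = 0;
  std::size_t length = 0;
#if defined(DEMO_USE_CUFILT)
  char* raw = __cu_demangle(mangled.c_str(), nullptr, &length, &status);
#else
  char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, &length, &status);
#endif
  // With a null output buffer both functions malloc the result,
  // so ownership goes to a unique_ptr that calls std::free.
  std::unique_ptr<char, decltype(&std::free)> buffer(raw, &std::free);

  // Documented status values: 0 success, -1 allocation failure,
  // -2 not a valid mangled name, -3 invalid argument.
  if (status != 0 || !buffer)
  {
    throw std::runtime_error("demangling failed with status " + std::to_string(status));
  }
  return std::string(buffer.get());
}
```

Throwing on failure is only one possible design; returning the input string unchanged, as `c++filt` does for names it cannot demangle, would be an equally reasonable choice.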