-
I'm a novice at profiling, so please keep that in mind. I am using NVTX by configuring NVIDIA Nsight to run my virtual environment's python.exe, which calls my script. I use [...] However, I can't see any of my [...] Maybe I am using the wrong approach. Essentially, I have one kernel that takes 1 second and another that takes 50 ms. The 1-second kernel calls a bunch of functions, but none of them show up.
Replies: 3 comments 18 replies
-
The Warp functions are compiled into the kernel, so you won't get granularity finer than the kernel itself with Nsight Systems (afaik). If you are trying to answer the question "why is this 1-second kernel taking so long?", then your question comes at a great time, since I just added a section to the docs about kernel-level profiling with Nsight Compute: https://nvidia.github.io/warp/profiling.html#nsight-compute-profiling
-
I'm not able to get that output. Am I missing an entire step? Basically, all I've done is:
Then part 1 of debugging was that I couldn't see the kernel functions, so I posted this question. Then: are there some debug environments I needed to set up?
-
Thanks @shi-eric! I got it working. If others come by this, it might help to read the thread. My thoughts on what might help the docs, or others:
If anything else comes to mind I'll follow up here, but hopefully some of these comments will help someone.