Warp functions are compiled into the kernel that calls them, so you won't get granularity finer than the kernel level with Nsight Systems (afaik). If the question you are trying to answer is "why is this 1-second kernel taking so long?", then it comes at a good time: I just added a section to the docs on kernel-level profiling with Nsight Compute: https://nvidia.github.io/warp/profiling.html#nsight-compute-profiling
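
To make the Nsight Compute suggestion concrete, here is a minimal sketch of how the profiler invocation could be assembled from Python. It is not taken from the Warp docs: the helper name `build_ncu_command`, the script name `simulate.py`, the kernel pattern `my_kernel`, and the report name are all hypothetical placeholders. It assumes `ncu` is on your `PATH` and uses a regex for the kernel name on the assumption that the compiled kernel name is not identical to the Python function name.

```python
import shlex


def build_ncu_command(script, kernel_regex, report_name="warp_kernel_report",
                      python_exe="python"):
    """Return an argv list that runs `script` under the Nsight Compute CLI.

    Hypothetical helper for illustration; it only builds the command and
    does not launch the profiler.
    """
    return [
        "ncu",
        "--kernel-name", f"regex:{kernel_regex}",  # profile only kernels whose name matches
        "--set", "full",                            # collect the "full" set of sections
        "-o", report_name,                          # report file name passed to ncu
        python_exe, script,
    ]


if __name__ == "__main__":
    # Placeholder script and kernel names; substitute your own.
    cmd = build_ncu_command("simulate.py", "my_kernel")
    print(shlex.join(cmd))
```

Printing the command rather than running it lets you check the filter first; once it looks right, you can paste it into a shell or pass the list to `subprocess.run`.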

Answer selected by cadop
Category
Q&A
2 participants