Configuration
FindHao edited this page Aug 31, 2025 · 4 revisions
CUTracer is configured by environment variables.
- `CUTRACER_INSTRUMENT`: comma-separated instrumentation types to enable
  - `opcode_only` (lightest)
  - `reg_trace`
  - `mem_trace`
- `CUTRACER_ANALYSIS`: comma-separated analysis types to enable
  - `proton_instr_histogram` (auto-enables `opcode_only` if not set)
  - `deadlock_detection` (auto-enables `reg_trace`)
- `KERNEL_FILTERS`: comma-separated substrings to match against kernel names (mangled or unmangled)
  - Example: `KERNEL_FILTERS=add,_Z2_gemm,reduce`
- `INSTR_BEGIN`, `INSTR_END`: static instruction index interval that gates instrumentation
  - Example: `INSTR_BEGIN=0 INSTR_END=1000`
- `TOOL_VERBOSE`: verbosity of tool logs (0/1/2)

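
The variables above can also be assembled programmatically before launching a traced application. The following is a minimal sketch, not part of CUTracer: the helper name `build_cutracer_env` and its parameters are hypothetical, and only the environment variable names and example values come from this page. How the tool itself is attached to the application is not covered here.

```python
import os


def build_cutracer_env(instrument=(), analysis=(), kernel_filters=(),
                       instr_begin=None, instr_end=None, verbose=None,
                       base=None):
    """Return a copy of `base` (default: os.environ) with CUTracer variables set.

    Hypothetical helper for illustration; it only formats the variables
    documented on this page and does not validate type names.
    """
    env = dict(os.environ if base is None else base)
    if instrument:
        env["CUTRACER_INSTRUMENT"] = ",".join(instrument)
    if analysis:
        env["CUTRACER_ANALYSIS"] = ",".join(analysis)
    if kernel_filters:
        env["KERNEL_FILTERS"] = ",".join(kernel_filters)
    if instr_begin is not None:
        env["INSTR_BEGIN"] = str(instr_begin)
    if instr_end is not None:
        env["INSTR_END"] = str(instr_end)
    if verbose is not None:
        if verbose not in (0, 1, 2):
            raise ValueError("TOOL_VERBOSE is documented as 0, 1, or 2")
        env["TOOL_VERBOSE"] = str(verbose)
    return env


# Build the settings on top of an empty base so only CUTracer variables appear.
env = build_cutracer_env(
    instrument=["reg_trace", "mem_trace"],
    kernel_filters=["add", "_Z2_gemm", "reduce"],
    instr_begin=0,
    instr_end=1000,
    verbose=1,
    base={},
)
for name, value in sorted(env.items()):
    print(f"{name}={value}")
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` together with whatever launch command your setup uses.
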
Other environment considerations:
- `CUDA_MANAGED_FORCE_DEVICE_ALLOC=1` is set by the tool to simplify channel memory handling.
- CUDA/NVBit/driver versions must be compatible with your GPU.

Notes 📝:
- When `proton_instr_histogram` is enabled, `opcode_only` is forced internally to minimize overhead and ensure the required data is available.
- When `deadlock_detection` is enabled, `reg_trace` is forced internally because loop detection relies on PC and opcode correlation per warp.
- `KERNEL_FILTERS` uses substring matching against both unmangled and mangled names; any match enables instrumentation for that function and related device functions.
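
The notes above can be modeled in a few lines of Python. This is an illustrative sketch of the described behavior, not CUTracer's actual code: the function names are hypothetical, the forced instrumentation type is assumed to be simply added to whatever was requested, and the sample kernel names are made up. The case where `KERNEL_FILTERS` is unset is not modeled.

```python
# Analysis type -> instrumentation type it forces, as stated in the notes.
FORCED_INSTRUMENTATION = {
    "proton_instr_histogram": "opcode_only",
    "deadlock_detection": "reg_trace",
}


def parse_list(value):
    """Split a comma-separated variable value, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def effective_instrumentation(instrument, analysis):
    """Requested instrumentation plus any type forced by an enabled analysis.

    Assumption: the forced type is added alongside the requested ones.
    """
    result = list(instrument)
    for name in analysis:
        forced = FORCED_INSTRUMENTATION.get(name)
        if forced is not None and forced not in result:
            result.append(forced)
    return result


def kernel_matches(filters, mangled_name, unmangled_name):
    """True if any filter is a substring of either form of the kernel name."""
    return any(f in mangled_name or f in unmangled_name for f in filters)


filters = parse_list("add,_Z2_gemm,reduce")

# Hypothetical kernel names, used only to exercise the matching rule.
print(kernel_matches(filters, "_Z10vector_addPfS_S_i",
                     "vector_add(float*, float*, float*, int)"))
print(kernel_matches(filters, "_Z6matmulPfS_S_",
                     "matmul(float*, float*, float*)"))
print(effective_instrumentation(["mem_trace"], ["deadlock_detection"]))
```

Because matching is by substring, a short filter such as `add` also selects any kernel whose name merely contains those letters, so longer or mangled substrings give tighter control.
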