Quickstart
FindHao edited this page Aug 31, 2025 · 7 revisions
Minimal steps to build CUTracer, attach it to an app, and collect instruction histograms.
- CUDA toolkit installed and nvcc in PATH
- A C++ compiler (like g++)
- Git (for cloning dependencies)
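The prerequisites above can be checked before building. The following is a minimal sketch, not part of CUTracer: the helper name missing_tools is hypothetical, and it only checks that the executables are on PATH, not that their versions are suitable.

```python
import shutil


def missing_tools(tools=("nvcc", "g++", "git")):
    """Return the entries of `tools` that cannot be found on PATH.

    The default tool names mirror the prerequisite list; substitute your own
    compiler name if you do not use g++.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

If anything is reported as missing, install it or adjust PATH before running the install script.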
First, run the script to download and set up NVBit.
cd /home/findhao/d/CUTracer
./install_third_party.sh
Then build CUTracer and check that the shared library was produced.
make -j$(nproc)
ls lib/cutracer.so
Note: The make command builds for all GPU architectures (-arch=all) by default. For a faster build, you can target a specific architecture, e.g., make ARCH=sm_90.
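If you are unsure which value to pass as ARCH, one option is to derive it from the GPU's compute capability. The sketch below assumes that your driver's nvidia-smi supports the compute_cap query field (older drivers may not) and that the first listed GPU is the one you want to target. The helper names cap_to_arch and detect_arch are hypothetical and not part of CUTracer.

```python
import subprocess


def cap_to_arch(compute_cap):
    """Convert a compute capability string such as '9.0' into 'sm_90'."""
    major, minor = compute_cap.strip().split(".")
    return f"sm_{int(major)}{int(minor)}"


def detect_arch():
    """Return an ARCH string for the first GPU, or None if it cannot be queried."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        # nvidia-smi is missing or rejected the query (assumption: older driver).
        return None
    for line in out.splitlines():
        if line.strip():
            try:
                return cap_to_arch(line)
            except ValueError:
                return None
    return None


if __name__ == "__main__":
    arch = detect_arch()
    if arch:
        print(f"make ARCH={arch}")
    else:
        print("Could not detect the GPU architecture; use the default build.")
```

The printed command can then be run in place of the default make invocation.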
Attach CUTracer to your application using environment variables. This example collects a lightweight instruction histogram.
CUDA_INJECTION64_PATH=/home/findhao/d/CUTracer/lib/cutracer.so \
CUTRACER_ANALYSIS=proton_instr_histogram \
KERNEL_FILTERS=add_kernel \
./your_app
Outputs (in your current working directory):
- cutracer_main_YYYYMMDD_HHMMSS.log (main tool log)
- kernel_<hash>_iter<idx>_<name>_hist.csv (per-kernel instruction histogram)
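After a run, the output files can be located and grouped by kernel using the naming patterns listed above. This is a minimal sketch under stated assumptions: it assumes that <hash> contains no underscores and that <idx> is a decimal integer, neither of which is confirmed here. The helper names parse_hist_name and find_outputs are hypothetical.

```python
import re
from pathlib import Path

# Assumption: <hash> has no underscores and <idx> is a decimal integer.
HIST_RE = re.compile(
    r"^kernel_(?P<hash>[^_]+)_iter(?P<idx>\d+)_(?P<name>.+)_hist\.csv$"
)


def parse_hist_name(filename):
    """Split a histogram file name into hash, iteration index, and kernel name.

    Returns None when the name does not match the assumed pattern.
    """
    match = HIST_RE.match(filename)
    if match is None:
        return None
    return {"hash": match["hash"], "iter": int(match["idx"]), "name": match["name"]}


def find_outputs(directory="."):
    """Return (log file names, [(histogram file name, parsed fields), ...])."""
    root = Path(directory)
    logs = sorted(p.name for p in root.glob("cutracer_main_*.log"))
    hists = []
    for path in sorted(root.glob("kernel_*_hist.csv")):
        info = parse_hist_name(path.name)
        if info is not None:
            hists.append((path.name, info))
    return logs, hists


if __name__ == "__main__":
    logs, hists = find_outputs()
    print("logs:", logs)
    for name, info in hists:
        print(name, info)
```

Run it from the directory where the application was launched, since that is where the outputs are written.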
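The following example demonstrates the full two-pass workflow for calculating IPC: one run under CUTracer to collect the instruction histogram, and one run without it to collect accurate timing. See also: Post-processing: IPC Merge.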
cd /home/findhao/d/CUTracer/tests/proton_tests
# 1) Collect instruction histogram using CUTracer (filtered to add_kernel)
CUDA_INJECTION64_PATH=/home/findhao/d/CUTracer/lib/cutracer.so \
CUTRACER_ANALYSIS=proton_instr_histogram \
KERNEL_FILTERS=add_kernel \
python ./vector-add-instrumented.py
# 2) Generate a clean Chrome trace without CUTracer for accurate timing
python ./vector-add-instrumented.py
# 3) Parse and join traces into an IPC CSV
python /home/findhao/d/CUTracer/scripts/parse_instr_hist_trace.py \
--chrome-trace ./vector.chrome_trace \
--cutracer-trace ./kernel_*_add_kernel_hist.csv \
--cutracer-log ./cutracer_main_*.log \
--output vectoradd_ipc.csv
Next: Analyses and Post-processing: IPC Merge.