Quickstart

FindHao edited this page Aug 31, 2025 · 7 revisions

Minimal steps to build CUTracer, attach it to an app, and collect instruction histograms.

Prerequisites 📦

  • CUDA toolkit installed, with nvcc in PATH
  • A C++ compiler (such as g++)
  • Git (for cloning dependencies)
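
The prerequisites above can be checked before building. The following is a minimal sketch, not part of CUTracer: missing_tools is a hypothetical helper name, and the check only confirms that each tool is in PATH, not that its version is suitable.

```shell
# Print the name of every listed tool that is not found in PATH.
# missing_tools is an illustrative helper, not something CUTracer ships.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Check the three tools named in the prerequisites list.
missing=$(missing_tools nvcc g++ git)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing >&2
else
  echo "All prerequisites found."
fi
```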

1. Install Dependencies 🛠️

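First, from the root of your CUTracer checkout, run the script that downloads and sets up NVBit. The commands in this guide use /home/findhao/d/CUTracer as the checkout path; substitute your own.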

cd /home/findhao/d/CUTracer
./install_third_party.sh

2. Build CUTracer 🧱

# Build using all available CPU cores
make -j$(nproc)
# Confirm that the shared library was produced
ls lib/cutracer.so

Note: By default, make builds for all GPU architectures (-arch=all). For a faster build, target only your GPU's architecture, e.g., make ARCH=sm_90.
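
If you are unsure which architecture to pass, one option is to derive it from the GPU's compute capability. This is a sketch under assumptions: cap_to_arch is a hypothetical helper, and the compute_cap query field is only available in reasonably recent nvidia-smi releases. The script prints a suggested command rather than running it.

```shell
# Convert a compute capability such as "9.0" into an architecture
# name such as "sm_90". cap_to_arch is an illustrative helper.
cap_to_arch() {
  echo "sm_$(echo "$1" | tr -d '. ')"
}

# Query the first GPU (assumes your nvidia-smi supports the
# compute_cap field) and print a suggested build command.
if command -v nvidia-smi >/dev/null 2>&1; then
  cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
  if [ -n "$cap" ]; then
    echo "Suggested build: make ARCH=$(cap_to_arch "$cap")"
  fi
fi
```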

3. Run a CUDA app with CUTracer ▶️

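Attach CUTracer to your application using environment variables: CUDA_INJECTION64_PATH points at the built library, CUTRACER_ANALYSIS selects the analysis, and KERNEL_FILTERS restricts collection to matching kernels. This example collects a lightweight instruction histogram.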

CUDA_INJECTION64_PATH=/home/findhao/d/CUTracer/lib/cutracer.so \
CUTRACER_ANALYSIS=proton_instr_histogram \
KERNEL_FILTERS=add_kernel \
./your_app

Outputs (in your current working directory):

  • cutracer_main_YYYYMMDD_HHMMSS.log (main tool log)
  • kernel_<hash>_iter<idx>_<name>_hist.csv (per-kernel instruction histogram)
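
If you attach CUTracer often, the three environment variables can be wrapped in a small shell function. This is a sketch only: with_cutracer and CUTRACER_LIB are hypothetical names introduced here for illustration, and CUTracer itself does not read CUTRACER_LIB.

```shell
# Path to the built library. CUTRACER_LIB is an illustrative
# variable for this sketch; adjust the default to your checkout.
CUTRACER_LIB="${CUTRACER_LIB:-/home/findhao/d/CUTracer/lib/cutracer.so}"

# Run a command with the instruction-histogram analysis attached.
# First argument: kernel filter; remaining arguments: the command.
with_cutracer() {
  kernel_filter="$1"
  shift
  env CUDA_INJECTION64_PATH="$CUTRACER_LIB" \
      CUTRACER_ANALYSIS=proton_instr_histogram \
      KERNEL_FILTERS="$kernel_filter" \
      "$@"
}

# Example usage (not executed here):
#   with_cutracer add_kernel ./your_app
```

Because the variables are set only for the wrapped command, later commands in the same shell run without CUTracer attached.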

4. End-to-end Example (Triton Proton Test) 🔁

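This demonstrates the full two-pass workflow for calculating IPC (instructions per cycle): one run with CUTracer attached to collect instruction counts, and one clean run to collect timing. See also: Post-processing: IPC Merge.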

cd /home/findhao/d/CUTracer/tests/proton_tests

# 1) Collect instruction histogram using CUTracer (filtered to add_kernel)
CUDA_INJECTION64_PATH=/home/findhao/d/CUTracer/lib/cutracer.so \
CUTRACER_ANALYSIS=proton_instr_histogram \
KERNEL_FILTERS=add_kernel \
python ./vector-add-instrumented.py

# 2) Generate a clean Chrome trace without CUTracer for accurate timing
python ./vector-add-instrumented.py

# 3) Parse and join traces into an IPC CSV
python /home/findhao/d/CUTracer/scripts/parse_instr_hist_trace.py \
  --chrome-trace ./vector.chrome_trace \
  --cutracer-trace ./kernel_*_add_kernel_hist.csv \
  --cutracer-log ./cutracer_main_*.log \
  --output vectoradd_ipc.csv
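
The glob patterns in step 3 can match several files if earlier runs left outputs in the same directory or if the kernel produced more than one histogram. Whether the parse script accepts multiple paths is not stated here, so a cautious approach is to resolve each pattern to exactly one file first. The helper below is a hypothetical sketch, not part of CUTracer.

```shell
# Print the path if exactly one existing file was passed;
# otherwise report the problem on stderr and return non-zero.
# only_match is an illustrative helper name.
only_match() {
  if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
    echo "expected exactly one match, got: $*" >&2
    return 1
  fi
  echo "$1"
}

# Example usage (not executed here):
#   hist=$(only_match ./kernel_*_add_kernel_hist.csv) || exit 1
#   log=$(only_match ./cutracer_main_*.log) || exit 1
#   then pass "$hist" and "$log" to --cutracer-trace and --cutracer-log
```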

Next: Analyses and Post-processing: IPC Merge.
