@FindHao FindHao commented Oct 28, 2025

Overview

This PR adds support for parsing MLIR callsite locations in Triton IR (TTIR/TTGIR), enabling tritonparse to capture and preserve call stack information for inlined functions.

Problem

Previously, tritonparse could not parse callsite location definitions like:

#loc220 = loc(callsite(#loc57 at #loc190))

This caused incomplete source code mappings, losing critical information about:

  • Function call chains (which function called which)
  • Inlining context (how nested functions are represented)
  • Complete source attribution (full path through code to reach an operation)

Solution

Implemented a hybrid approach that:

  1. Parses callsite definitions and inherits location from the callee (code being called)
  2. Preserves caller references as metadata for call stack traversal
  3. Propagates callsite information through the entire mapping pipeline

Key Design Decisions

  • Callee as primary location: Maps to the actual code being executed (most relevant for debugging)
  • Metadata preservation: Stores references (loc IDs) rather than fully expanding call stacks
  • Backward compatible: Adds optional fields without breaking existing tools
  • Extensible: Future enhancements can traverse and expand call chains on demand

Implementation Details

1. Added Callsite Pattern (ir_parser.py)

CALLSITE_PATTERN = re.compile(
    r"#loc(\d+)\s*=\s*loc\(\s*callsite\(\s*#loc(\d*)\s+at\s+#loc(\d*)\s*\)\s*\)"
)
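As a quick sanity check of the pattern, the sketch below applies it to the example lines from this description. The helper name parse_callsite is hypothetical and exists only for illustration; it is not part of ir_parser.py.

```python
import re

CALLSITE_PATTERN = re.compile(
    r"#loc(\d+)\s*=\s*loc\(\s*callsite\(\s*#loc(\d*)\s+at\s+#loc(\d*)\s*\)\s*\)"
)


def parse_callsite(line):
    """Return (loc_id, callee_id, caller_id), or None if the line is not a callsite.

    Hypothetical helper for illustration only.
    """
    match = CALLSITE_PATTERN.search(line)
    return match.groups() if match else None


# A callsite definition yields the three captured IDs as strings.
print(parse_callsite("#loc220 = loc(callsite(#loc57 at #loc190))"))
# A plain file location is not matched by this pattern.
print(parse_callsite('#loc7 = loc("file.py":1091:8)'))
```

The first call returns the tuple ('220', '57', '190'); the second returns None, so plain locations still need to be handled by the existing location pattern.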

2. Enhanced extract_loc_definitions() (ir_parser.py)

  • Collects all callsite definitions during IR parsing
  • Resolves callsite references by inheriting location info from callee
  • Stores callsite metadata: is_callsite, callsite_callee, callsite_caller
  • Validates references with warning messages for undefined locs
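The resolution step described above could be sketched roughly as follows. This is not the code in ir_parser.py: the function name resolve_loc_definitions, the LOC_PATTERN regex for plain locations, and the warning text are all assumptions made for the example. Only the callsite pattern and the three metadata field names come from this PR.

```python
import re
import warnings

# Assumed pattern for plain file locations; the real parser's pattern may differ.
LOC_PATTERN = re.compile(r'#loc(\d*)\s*=\s*loc\("([^"]+)":(\d+):(\d+)\)')
CALLSITE_PATTERN = re.compile(
    r"#loc(\d+)\s*=\s*loc\(\s*callsite\(\s*#loc(\d*)\s+at\s+#loc(\d*)\s*\)\s*\)"
)


def resolve_loc_definitions(ir_text):
    """Hypothetical sketch: collect locs, then resolve callsites via their callee."""
    locs, pending = {}, []
    for line in ir_text.splitlines():
        callsite = CALLSITE_PATTERN.search(line)
        if callsite:
            pending.append(callsite.groups())
            continue
        plain = LOC_PATTERN.search(line)
        if plain:
            loc_id, path, row, col = plain.groups()
            locs[loc_id] = {"file": path, "line": int(row), "column": int(col)}

    # Repeat until no progress: a callee may itself be a callsite defined later.
    progressed = True
    while pending and progressed:
        progressed = False
        remaining = []
        for loc_id, callee, caller in pending:
            if callee in locs:
                base = locs[callee]
                locs[loc_id] = {
                    "file": base["file"],
                    "line": base["line"],
                    "column": base["column"],
                    "is_callsite": True,
                    "callsite_callee": callee,
                    "callsite_caller": caller,
                }
                progressed = True
            else:
                remaining.append((loc_id, callee, caller))
        pending = remaining

    # Warn about references that never resolved.
    for loc_id, callee, _ in pending:
        warnings.warn(f"#loc{loc_id}: callee #loc{callee} is undefined")
    for loc_id, info in locs.items():
        if info.get("is_callsite") and info["callsite_caller"] not in locs:
            warnings.warn(f"#loc{loc_id}: caller #loc{info['callsite_caller']} is undefined")
    return locs


SAMPLE_IR = '''
#loc7 = loc("file.py":1091:8)
#loc57 = loc("file.py":421:16)
#loc58 = loc("file.py":853:16)
#loc190 = loc(callsite(#loc58 at #loc7))
#loc220 = loc(callsite(#loc57 at #loc190))
'''
resolved = resolve_loc_definitions(SAMPLE_IR)
print(resolved["220"])
```

Storing the callee and caller as loc IDs, rather than copying the caller's location into each entry, keeps the entry small and matches the "references rather than fully expanded call stacks" decision above.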

3. Updated generate_source_mappings() (trace_processor.py)

  • Propagates callsite metadata from loc_defs to final mappings
  • Enables downstream tools to identify and traverse call chains
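A minimal sketch of the propagation step, assuming loc_defs has the shape produced by the parsing step (a dict keyed by loc ID). The function build_mapping and the CALLSITE_KEYS constant are hypothetical names, not the actual contents of trace_processor.py.

```python
# Field names taken from this PR; everything else here is illustrative.
CALLSITE_KEYS = ("is_callsite", "callsite_callee", "callsite_caller")


def build_mapping(loc_defs, loc_id, ttir_line):
    """Hypothetical sketch: build one source mapping, copying callsite metadata if present."""
    info = loc_defs[loc_id]
    mapping = {
        "file": info["file"],
        "line": info["line"],
        "column": info["column"],
        "ttir_line": ttir_line,
    }
    # Only callsite locations carry the extra fields, so plain mappings are unchanged.
    if info.get("is_callsite"):
        for key in CALLSITE_KEYS:
            mapping[key] = info[key]
    return mapping


loc_defs = {
    "57": {"file": "file.py", "line": 421, "column": 16},
    "220": {
        "file": "file.py",
        "line": 421,
        "column": 16,
        "is_callsite": True,
        "callsite_callee": "57",
        "callsite_caller": "190",
    },
}
print(build_mapping(loc_defs, "220", 131))
print(build_mapping(loc_defs, "57", 12))
```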

Data Structure Example

For a nested callsite like:

#loc7 = loc("file.py":1091:8)
#loc57 = loc("file.py":421:16)
#loc58 = loc("file.py":853:16)
#loc190 = loc(callsite(#loc58 at #loc7))
#loc220 = loc(callsite(#loc57 at #loc190))
%0 = tt.load %ptr loc(#loc220)

The resulting mapping for the tt.load operation, here at TTIR line 131:

{
  "file": "file.py",
  "line": 421,              // From callee (loc57) - actual code executing
  "column": 16,
  "ttir_line": 131,
  "is_callsite": true,
  "callsite_callee": "57",  // Reference to called code
  "callsite_caller": "190"  // Reference to caller (can traverse chain)
}

Call chain represented (outermost caller first): _ragged_hstu_attn_fwd (1091:8) → _ragged_hstu_attn_fwd_compute (853:16) → _ragged_hstu_attn_fwd_one_block (421:16) ← executing here
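A downstream tool could recover this chain by following callsite_caller references. The sketch below is one possible way to do that and is not included in this PR; expand_call_chain is a hypothetical name, and the seen set is a simple guard against cyclic references. It assumes each callsite entry has already inherited its callee's file, line, and column.

```python
def expand_call_chain(loc_defs, loc_id):
    """Hypothetical sketch: return frames as (file, line, column), outermost caller first."""
    frames = []
    seen = set()  # guards against cycles in malformed callsite graphs
    current = loc_id
    while current in loc_defs and current not in seen:
        seen.add(current)
        info = loc_defs[current]
        frames.append((info["file"], info["line"], info["column"]))
        # Plain locations end the chain; callsites continue at their caller.
        current = info.get("callsite_caller") if info.get("is_callsite") else None
    return frames[::-1]


loc_defs = {
    "7": {"file": "file.py", "line": 1091, "column": 8},
    "57": {"file": "file.py", "line": 421, "column": 16},
    "58": {"file": "file.py", "line": 853, "column": 16},
    "190": {
        "file": "file.py", "line": 853, "column": 16,
        "is_callsite": True, "callsite_callee": "58", "callsite_caller": "7",
    },
    "220": {
        "file": "file.py", "line": 421, "column": 16,
        "is_callsite": True, "callsite_callee": "57", "callsite_caller": "190",
    },
}
for frame in expand_call_chain(loc_defs, "220"):
    print(frame)
```

For loc 220 this yields the frames at 1091:8, 853:16, and 421:16, in that order.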

Testing

Added unit tests in tests/test_tritonparse.py (TestTritonparseCPU::test_callsite_parsing), which:

  • Validate nested callsite parsing
  • Verify metadata propagation to mappings
  • Cover both simple and nested callsite scenarios

All tests pass ✅

Impact

Benefits

  • ✅ Complete source mapping for inlined functions
  • ✅ Preserves call stack information for debugging
  • ✅ Enables future call chain visualization
  • ✅ Backward compatible with existing tools

No Breaking Changes

  • Only adds optional fields to existing mappings
  • Existing code that doesn't check for callsite fields continues to work
  • No changes to public APIs
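To illustrate the compatibility claim, a consumer that wants the new fields can read them defensively, so the same code handles mappings produced before and after this change. The describe_mapping function below is a hypothetical consumer, not part of tritonparse.

```python
def describe_mapping(mapping):
    """Hypothetical consumer: format a mapping, using callsite fields only if present."""
    base = f'{mapping["file"]}:{mapping["line"]}:{mapping["column"]}'
    # dict.get keeps older mappings (without callsite fields) working unchanged.
    if mapping.get("is_callsite"):
        return f'{base} (inlined, caller #loc{mapping.get("callsite_caller")})'
    return base


old_style = {"file": "file.py", "line": 421, "column": 16, "ttir_line": 131}
new_style = dict(old_style, is_callsite=True, callsite_callee="57", callsite_caller="190")
print(describe_mapping(old_style))
print(describe_mapping(new_style))
```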

Files Changed

  1. tritonparse/ir_parser.py - Added callsite parsing logic
  2. tritonparse/trace_processor.py - Propagate callsite metadata
  3. tests/test_tritonparse.py - Added unit tests
  4. CALLSITE_IMPLEMENTATION.md - Detailed implementation documentation

Future Work

Potential enhancements (not in this PR):

  1. Automatic call stack expansion utility functions
  2. Call stack caching for performance optimization
  3. Frontend UI support for call chain visualization
  4. Cycle detection for complex callsite graphs

@meta-cla meta-cla bot added the CLA Signed label Oct 28, 2025
Implement parsing for MLIR callsite locations (e.g., `loc(callsite(#loc57 at #loc190))`)
to preserve function call chain information for inlined functions.

Changes:
- Add CALLSITE_PATTERN regex to recognize callsite definitions
- Extend extract_loc_definitions() to collect and resolve callsite references
- Update generate_source_mappings() to propagate callsite metadata
- Add comprehensive unit tests in TestTritonparseCPU::test_callsite_parsing

Callsite locations inherit file/line/column from the callee (actual code executing)
while preserving caller references as metadata for call stack traversal. This enables
complete source mapping for inlined functions without breaking existing tools.

Test Plan:
python -m pytest tests/test_tritonparse.py::TestTritonparseCPU::test_callsite_parsing -v
@FindHao FindHao force-pushed the findhao/nested_callsite_loc_info branch from c7585a5 to 90bd668 October 28, 2025 17:14
@FindHao FindHao marked this pull request as ready for review October 28, 2025 17:16
meta-codesync bot commented Oct 28, 2025

@FindHao has imported this pull request. If you are a Meta employee, you can view this in D85681542.

meta-codesync bot commented Oct 29, 2025

@FindHao merged this pull request in ad41299.

@FindHao FindHao deleted the findhao/nested_callsite_loc_info branch October 30, 2025 03:12