rdspring1
Collaborator

This PR adds CUPTI profiling support to the direct bindings. Instead of the two-step pattern fd.execute(profile=True); data = fd.profile, the direct bindings use a context manager named PythonProfiler. This approach allows profiling any CPP executor or the Python FusionDefinition without a separate execute function for profiling.

Example

with PythonProfiler(auto_scheduled) as prof:
    fd.execute(inputs)
# After the context manager exits, prof.profile contains the `FusionProfile` data
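
As a rough illustration of the pattern only (not the nvFuser implementation), the sketch below uses a hypothetical ToyProfiler that times the with-block with time.perf_counter instead of CUPTI. ToyProfile, ToyProfiler, and run_profiled are made-up names for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ToyProfile:
    """Hypothetical stand-in for FusionProfile; holds one elapsed time."""

    elapsed_s: float


class ToyProfiler:
    """Hypothetical context manager that mimics the PythonProfiler usage pattern."""

    def __init__(self) -> None:
        self.profile: Optional[ToyProfile] = None
        self._start: float = 0.0

    def __enter__(self) -> "ToyProfiler":
        # Start collecting when the with-block is entered.
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Stop collecting and publish the result when the block exits.
        self.profile = ToyProfile(time.perf_counter() - self._start)
        return False  # do not swallow exceptions raised inside the block


def run_profiled(work: Callable[[], int]) -> Tuple[int, ToyProfile]:
    """Run any callable under the profiler; no profiling-specific execute is needed."""
    with ToyProfiler() as prof:
        result = work()
    # The profile is only available after the with-block has exited.
    return result, prof.profile
```

The point of the design is that the profiled call itself stays unchanged: anything executed inside the with-block is covered, and the result is read from the profiler object afterwards.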

@rdspring1 added the labels Python API (Issues related to the Python API) and Direct Bindings (Python extension with direct mapping to NvFuser CPP objects) on Oct 4, 2025
@rdspring1
Collaborator Author

!test

github-actions bot commented Oct 4, 2025

Description

  • Add CUPTI-based profiling via PythonProfiler context manager

  • Expose FusionProfile and KernelProfile with detailed metrics

  • Implement Python bindings for profiling functionality

  • Add comprehensive tests for auto/user scheduled profiling


Changes walkthrough 📝

Relevant files
Bug fix

fusion_profiler.cpp: Fix typos and relax correlation ID check (+2/-7)
csrc/fusion_profiler.cpp

  • Fix typo in comment: "Activty" → "Activity"
  • Fix typo in comment: "CUPT docs" → "CUPTI docs"
  • Remove NVF_CHECK on correlation ID reuse, allowing overwrites

Enhancement

bindings.cpp: Register profiling bindings (+1/-0)
python/python_direct/bindings.cpp

  • Add bindProfile(nvfuser) to initialize profiling bindings

profile.cpp: Add Python bindings for profiling (+256/-0)
python/python_direct/profile.cpp

  • Implement bindFusionProfile to expose KernelProfile and FusionProfile to Python
  • Define PythonProfiler class with context manager support (__enter__, __exit__)
  • Expose profiling properties like name, time, bandwidth, scheduler, etc.
  • Add __repr__ for KernelProfile and FusionProfile

fusion_profiler.h: Export profiler classes via NVF_API (+9/-9)
csrc/fusion_profiler.h

  • Mark KernelProfile, FusionProfile, SegmentProfiler, and FusionProfiler methods with NVF_API for export
  • Update struct declarations to be accessible from shared library

bindings.h: Declare profiling bindings function (+3/-0)
python/python_direct/bindings.h

  • Declare bindProfile(py::module& nvfuser) to register profiling bindings

Tests

test_python_direct.py: Add direct Python profiling tests (+90/-1)
tests/python/direct/test_python_direct.py

  • Import PythonProfiler in test setup
  • Add three new tests: auto-scheduled, user-scheduled, and non-codegen kernels
  • Validate profile output including segments, kernel count, and scheduler type

test_python_frontend.py: Re-enable legacy profiler tests (+0/-3)
tests/python/test_python_frontend.py

  • Remove skip markers from three profiler-related tests
  • Enable test_fusion_profiler, test_fusion_profiler_user_schedule, and test_fusion_profiler_with_noncodegen_kernels

Configuration changes

CMakeLists.txt: Include profile.cpp in build (+1/-0)
CMakeLists.txt

  • Add profile.cpp to Python direct bindings source list

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
⚡ Recommended focus areas for review

Missing Return Statement

The method get_fusion_profile in the PythonProfiler class has control paths that do not explicitly return a value, which may lead to undefined behavior.

    const FusionProfile& get_fusion_profile() {
      const FusionProfile& profile = FusionProfiler::profile();
      NVF_ERROR(
          profile.fusion_id != -1,
          "Something went wrong with Fusion Profiling as an illegal fusion_id "
          "was returned!")
      NVF_ERROR(
          profile.segments > 0,
          "Something went wrong with Fusion Profiling as no kernel segments were "
          "profiled!")
      return profile;
    }
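
For readers following along in Python, the same validate-then-return shape can be sketched as below. This is only an analogy to the C++ snippet above: ToyFusionProfile and checked_profile are hypothetical names, and RuntimeError stands in for whatever NVF_ERROR raises.

```python
from dataclasses import dataclass


@dataclass
class ToyFusionProfile:
    """Hypothetical stand-in carrying only the two fields checked above."""

    fusion_id: int = -1
    segments: int = 0


def checked_profile(profile: ToyFusionProfile) -> ToyFusionProfile:
    """Return the profile only if both sanity checks pass."""
    if profile.fusion_id == -1:
        raise RuntimeError("an illegal fusion_id was returned")
    if profile.segments <= 0:
        raise RuntimeError("no kernel segments were profiled")
    # Every path that reaches this line has passed both checks.
    return profile
```

In this sketch each path through the function either raises or reaches the single return.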

Duplicate Definition

The __repr__ method for KernelProfile is defined twice in the Python bindings, which may lead to unexpected behavior or override issues.

    py::class_<FusionProfile> fusion_prof(nvfuser, "FusionProfile");
    kernel_prof.def("__repr__", [](KernelProfile& self) {
      std::stringstream ss;
      ss << self;
      return ss.str();
    });

Comment Typo

A typo in a comment may affect code readability and maintainability.

    // by CUPTI and you can find their signatured definitions in the CUPTI docs.

    uint32_t seg_id,
    uint32_t corr_id) {
    FusionProfiler& fp = get();
    NVF_CHECK(
    Collaborator Author

    @Priya2698 This is the assertion triggered by test_fusion_profiler when I remove pytest.mark.skipif. I'm hitting this check for a correlation_id not tied to the correlationId for the KernelProfile.

    Collaborator

    @Priya2698 commented Oct 6, 2025

    See #4685
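
The difference between the strict check discussed in this thread and the relaxed behavior (overwriting on correlation ID reuse) can be sketched roughly as follows. CorrelationMap and its methods are hypothetical and only model the bookkeeping; the actual data structure inside FusionProfiler is not shown in this conversation and may differ.

```python
from typing import Dict


class CorrelationMap:
    """Hypothetical map from a correlation ID to the segment that launched it."""

    def __init__(self, strict: bool) -> None:
        self.strict = strict
        self._seg_by_corr: Dict[int, int] = {}

    def record(self, seg_id: int, corr_id: int) -> None:
        # Strict mode models the removed check: reusing an ID is an error.
        if self.strict and corr_id in self._seg_by_corr:
            raise RuntimeError(f"correlation id {corr_id} was already recorded")
        # Relaxed mode simply overwrites the earlier entry.
        self._seg_by_corr[corr_id] = seg_id

    def segment_for(self, corr_id: int) -> int:
        return self._seg_by_corr[corr_id]
```

With strict=False, a second record call for the same correlation ID replaces the earlier segment instead of aborting, which is the trade-off the relaxed check accepts.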
