Feature Request: Support accessing prior results when profiling multiple functions in a single script #12

@ConvolutedDog

Problem

Nsight's profiling system currently profiles each function decorated with @nsight.analyze.kernel in a separate execution of the script. This lets each function be profiled on its own, but it creates two major issues:

  1. Results Isolation: Each profiling execution returns results only for the function currently being profiled. Every other decorated function in the script returns None during that execution.

  2. Data Inaccessibility: Results from earlier profiling executions are not available in later ones, so no single script run can access the profiling results of all decorated functions.

The root cause is that @nsight.analyze.kernel triggers a separate script execution for each decorated function it profiles.

Current Behavior

import torch
import nsight

sizes = [(2**i,) for i in range(11, 14)]

@nsight.analyze.kernel(configs=sizes, runs=10)
def kernel1(n: int) -> None:
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")
    with nsight.annotate("matmul"):
        _ = a @ b

@nsight.analyze.kernel(configs=sizes, runs=10)
def kernel2(n: int) -> None:
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")
    with nsight.annotate("matmul"):
        _ = a @ b

def main() -> None:
    # When profiling kernel1, this call runs and returns a ProfilerResult object.
    # When profiling kernel2 (the second execution of the script), this call runs again, but kernel1() returns None.
    result1 = kernel1()
    print("Kernel1 results:", result1.to_dataframe())  # When profiling kernel2, result1 is None, so this cannot be accessed

    # When profiling kernel2, this call returns a ProfilerResult object.
    result2 = kernel2()
    print("Kernel2 results:", result2.to_dataframe())  # When profiling kernel2, this can be accessed

if __name__ == "__main__":
    main()

Expected Behavior

Both decorated functions should return valid ProfilerResult objects within a single script run, so that each result can be accessed independently:

# User can access multiple results.
results = []
results.append(kernel1())
results.append(kernel2())
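
One possible way to get this behavior is to persist each function's result when it is produced and reload it in executions where the function returns None. The sketch below illustrates that idea only; it is not how nsight works today, and it does not use any nsight API. The decorator cached_result, the JSON cache files, the active dictionary that simulates "which function is being profiled", and the numeric values (placeholders, not measurements) are all assumptions made for illustration.

```python
import functools
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Optional


def cached_result(cache_dir: Path) -> Callable:
    """Hypothetical decorator: save a non-None result, reuse it when the call returns None."""

    def decorator(func: Callable[[], Optional[Any]]) -> Callable[[], Optional[Any]]:
        cache_file = cache_dir / f"{func.__name__}.json"

        @functools.wraps(func)
        def wrapper() -> Optional[Any]:
            result = func()
            if result is not None:
                # This execution produced a result: persist it for later executions.
                cache_file.write_text(json.dumps(result))
                return result
            if cache_file.exists():
                # No result in this execution: fall back to the persisted one.
                return json.loads(cache_file.read_text())
            return None

        return wrapper

    return decorator


cache_dir = Path(tempfile.mkdtemp())
active = {"name": "kernel1"}  # simulates which function the current execution profiles


@cached_result(cache_dir)
def kernel1() -> Optional[dict]:
    # Placeholder data standing in for a profiling result.
    return {"matmul": [1.0, 2.0]} if active["name"] == "kernel1" else None


@cached_result(cache_dir)
def kernel2() -> Optional[dict]:
    return {"matmul": [3.0, 4.0]} if active["name"] == "kernel2" else None


# Simulated first execution: only kernel1 yields a result.
first_pass = [kernel1(), kernel2()]

# Simulated second execution: kernel2 yields a result, kernel1 is read from the cache.
active["name"] = "kernel2"
second_pass = [kernel1(), kernel2()]
```

In this simulation the second execution sees both results, which matches the expected behavior requested above. A real implementation would need a serializable form of ProfilerResult and a way to invalidate stale cache entries, neither of which is covered here.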
