@ConvolutedDog
Contributor

Introduction:

This commit introduces support for multiple derived metrics from a single profiling run and enhances the plotting system with explicit metric selection. Users can now compute multiple related metrics (e.g., TFLOPS and Arithmetic Intensity) in one derivation and selectively visualize them.

Key Changes:

  • Support multiple derived metrics: derive_metric now supports both:
    • Scalar return: Uses function name as metric name
    • Dictionary return: Uses keys as metric names for multiple derived metrics
  • Added metric parameter to @nsight.analyze.plot decorator:
    • metric=None: Plots the single metric if only one exists; raises an error if multiple metrics exist
    • metric="metric_name": Plots specified metric from available metrics
  • Removed "Transformed" column and exploded all metrics into "Metric" column:
    • All collected and derived metrics now appear in unified "Metric" column
    • Each metric gets its own row in dataframe, simplifying data structure
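The naming and selection rules listed above can be sketched in plain Python. This is only an illustration of the described behavior, not the library's implementation: `normalize_derived` and `select_metric` are hypothetical helper names, the use of `ValueError` is an assumption, and the `Value` column name in the final rows is made up for the example.

```python
from typing import Callable, Optional


def normalize_derived(func: Callable[..., object], *args: float) -> dict[str, float]:
    """Return derived metrics as a {metric_name: value} mapping.

    A dictionary return keeps its keys as metric names; a scalar return
    is stored under the function's own name.
    """
    result = func(*args)
    if isinstance(result, dict):
        return {str(name): float(value) for name, value in result.items()}
    return {func.__name__: float(result)}


def select_metric(available: list[str], metric: Optional[str] = None) -> str:
    """Pick the metric to plot, following the `metric` parameter rules."""
    if metric is None:
        if len(available) == 1:
            return available[0]
        # Assumption: the real error type and message may differ.
        raise ValueError(f"Multiple metrics available {available}; pass metric=...")
    if metric not in available:
        raise ValueError(f"Unknown metric {metric!r}; available: {available}")
    return metric


def tflops(time_ns: float, n: int) -> float:
    # Scalar pattern: the metric is named after the function ("tflops").
    return 2 * n * n * n / (time_ns / 1e9) / 1e12


def compute_metrics(time_ns: float, n: int) -> dict[str, float]:
    # Dictionary pattern: one key per derived metric.
    return {
        "TFLOPS": 2 * n * n * n / (time_ns / 1e9) / 1e12,
        "ArithIntensity": (2 * n * n * n) / ((n * n * 3) * 4),
    }


scalar = normalize_derived(tflops, 1e6, 1024)
multi = normalize_derived(compute_metrics, 1e6, 1024)

# One row per metric, as in the unified "Metric" column.
rows = [{"n": 1024, "Metric": name, "Value": value} for name, value in multi.items()]
```

With a single metric, `select_metric(list(scalar))` returns `"tflops"` without an explicit choice; with the dictionary version, the caller has to name one of the keys.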

Example - Multiple derived metrics in one function:

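import nsight
import torch

# `sizes` (the configurations to profile) is assumed to be defined elsewhere.

def compute_metrics(time_ns, n):
    # One dictionary key per derived metric.
    return {
        "TFLOPS": 2 * n * n * n / (time_ns / 1e9) / 1e12,
        # FLOPs per byte: three n x n float32 matrices at 4 bytes per element.
        "ArithIntensity": (2 * n * n * n) / ((n * n * 3) * 4),
    }

@nsight.analyze.plot(
    metric="ArithIntensity",  # plot only this derived metric
)
@nsight.analyze.kernel(
    configs=sizes,
    derive_metric=compute_metrics,
)
def profiled_func(n: int) -> None:
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")

    with nsight.annotate("matmul"):
        _ = a @ b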
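As a quick sanity check of the two formulas in the example, the sketch below evaluates them for one size. The 10 ms kernel time is a made-up value, not a measurement, and the byte count assumes (as the factor of 4 implies) that each of the three n x n float32 matrices is moved exactly once.

```python
n = 4096
time_ns = 10_000_000  # hypothetical kernel time of 10 ms, not a measured value

flops = 2 * n * n * n            # multiply-adds of an n x n matmul
bytes_moved = (n * n * 3) * 4    # A, B and the result, 4 bytes per element

tflops = flops / (time_ns / 1e9) / 1e12
arith_intensity = flops / bytes_moved

print(f"TFLOPS: {tflops:.2f}")                    # about 13.74
print(f"ArithIntensity: {arith_intensity:.2f}")   # about 682.67 FLOP/byte
```

Note that the arithmetic intensity simplifies to n / 6 and does not depend on the measured time, while TFLOPS does.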

Examples Update:

  • examples/03_custom_metrics.py: Enhanced with scalar vs dictionary patterns
  • examples/04_multi_parameter.py: Updated to use dictionary return
  • examples/05_subplots.py: Updated to use dictionary return
  • examples/06_plot_customization.py: Updated to use dictionary return
  • examples/09_advanced_metric_custom.py: Enhanced with dictionary return
  • docs/source/overview/architecture.rst: Updated with new patterns

Files Modified:

  • Core modules: analyze.py, collection/core.py, extraction.py, transformation.py, visualization.py
  • Examples: 03_custom_metrics.py, 04_multi_parameter.py, 05_subplots.py, 06_plot_customization.py, 09_advanced_metric_custom.py, 11_output_csv.py
  • Documentation: docs/source/overview/architecture.rst
  • Tests: tests/test_api_params.py, tests/test_collection.py, tests/test_profiler.py

This update enables richer metric analysis with multiple derived metrics
while providing precise control over visualization selection.

Signed-off-by: ConvolutedDog <[email protected]>
@copy-pr-bot

copy-pr-bot bot commented Dec 20, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: ConvolutedDog <[email protected]>
@Alok-Joshi
Collaborator

/ok to test 518f4ca

@Alok-Joshi
Collaborator

@ConvolutedDog can you please merge the latest main branch from upstream? We have fixed the CI issues.

@ConvolutedDog
Contributor Author

@Alok-Joshi I've just merged the upstream main branch. Could you please test if the latest CI works?

@Alok-Joshi
Collaborator

/ok to test 1ebb37a

@bastianhagedorn
Collaborator

LGTM. Thanks @ConvolutedDog for adding this feature.


# Filter by metric
if metric is not None:
    agg_df = agg_df[agg_df["Metric"] == metric]
Collaborator

Please add `.copy()` here to prevent the pandas `SettingWithCopyWarning`. Since we add new columns to `agg_df` later (line 101), we need an explicit copy rather than a view.
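A minimal sketch of the requested change, using a small stand-in dataframe. The `AvgValue` and `Label` columns and the `full_df` name are hypothetical and only serve the illustration; the actual columns in the plotting code may differ.

```python
import pandas as pd

# Stand-in for the aggregated, one-row-per-metric dataframe.
full_df = pd.DataFrame(
    {
        "Metric": ["TFLOPS", "ArithIntensity", "TFLOPS", "ArithIntensity"],
        "n": [1024, 1024, 2048, 2048],
        "AvgValue": [10.0, 1024 / 6, 12.0, 2048 / 6],  # made-up TFLOPS values
    }
)
metric = "ArithIntensity"

agg_df = full_df
# Filter by metric
if metric is not None:
    # .copy() makes the filtered frame independent of full_df, so the
    # column assignment below modifies only agg_df.
    agg_df = agg_df[agg_df["Metric"] == metric].copy()

# Adding a column later is the kind of write that can trigger the warning
# on a filtered frame that was not explicitly copied.
agg_df["Label"] = agg_df["n"].astype(str)
```

The original `full_df` is left untouched, and the later column assignment operates on a frame that pandas knows is standalone.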

Contributor Author

Added. Thanks for the suggestion!

Signed-off-by: ConvolutedDog <[email protected]>
@Alok-Joshi
Collaborator

/ok to test 6997a8a

Collaborator

@bastianhagedorn bastianhagedorn left a comment

LGTM. Thanks @ConvolutedDog

@Alok-Joshi
Collaborator

Merging. Thanks @ConvolutedDog !

@Alok-Joshi Alok-Joshi merged commit 52d9119 into NVIDIA:main Jan 9, 2026
3 checks passed