Conversation

@lingbin
Contributor

@lingbin lingbin commented Nov 3, 2025

Fixes #26516

All six OS-related metrics were defined as AVG type but reported
per-interval delta values, so the averages were incorrect and Prometheus
could lose data between scrapes.

Changed metrics to report cumulative values since process start:

  • presto_cpp.os_user_cpu_time_micros
  • presto_cpp.os_system_cpu_time_micros
  • presto_cpp.os_num_soft_page_faults
  • presto_cpp.os_num_hard_page_faults
  • presto_cpp.os_num_voluntary_context_switches
  • presto_cpp.os_num_forced_context_switches

This ensures:

  1. Alignment with other AVG metrics in the system (task counts,
    cache sizes, etc.)
  2. Proper rate calculations in monitoring systems and no data loss
    regardless of scraping intervals
== NO RELEASE NOTE ==
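
The data-loss argument above can be illustrated with a small model. This is only a sketch under stated assumptions: the gauge is overwritten on every update, the scraper reads it once every `scrapeEvery` updates, and the function names `totalSeenWithDeltas` and `totalSeenWithCumulative` are hypothetical and do not appear in the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Delta model: the gauge holds only the latest per-interval increment, so
// increments published between two scrapes are overwritten and never seen.
// scrapeEvery must be greater than zero.
int64_t totalSeenWithDeltas(
    const std::vector<int64_t>& increments,
    std::size_t scrapeEvery) {
  int64_t seen = 0;
  for (std::size_t i = 0; i < increments.size(); ++i) {
    const int64_t gauge = increments[i]; // overwritten on each update
    if ((i + 1) % scrapeEvery == 0) {
      seen += gauge; // the scraper only observes this one delta
    }
  }
  return seen;
}

// Cumulative model: the gauge holds the running total since process start,
// so the most recent scrape already accounts for every earlier increment.
// scrapeEvery must be greater than zero.
int64_t totalSeenWithCumulative(
    const std::vector<int64_t>& increments,
    std::size_t scrapeEvery) {
  int64_t gauge = 0;
  int64_t lastScraped = 0;
  for (std::size_t i = 0; i < increments.size(); ++i) {
    gauge += increments[i];
    if ((i + 1) % scrapeEvery == 0) {
      lastScraped = gauge;
    }
  }
  return lastScraped;
}
```

With increments `{10, 20, 30, 40}` and a scrape after every second update, the delta model accounts for 60 while the cumulative model accounts for the full 100. When the scraper reads after every update, both models agree.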

@lingbin lingbin requested review from a team as code owners November 3, 2025 16:24
@sourcery-ai
Contributor

sourcery-ai bot commented Nov 3, 2025

Reviewer's Guide

This PR converts six OS-related metrics from reporting delta values to reporting cumulative values by eliminating subtraction of previous readings and removing obsolete state variables used for delta calculations.

Class diagram for updated PeriodicTaskManager OS metrics logic

classDiagram
class PeriodicTaskManager {
  -lastHttpClientNumConnectionsCreated_: int64_t
  +updateOperatingSystemStats()
  +addOperatingSystemStatsUpdateTask()
}

%% Removed attributes for OS metric deltas
%% lastUserCpuTimeUs_, lastSystemCpuTimeUs_, lastSoftPageFaults_, lastHardPageFaults_, lastVoluntaryContextSwitches_, lastForcedContextSwitches_ are no longer present

Flow diagram for OS metrics reporting change (delta to cumulative)

flowchart TD
    A["Collect OS metric (e.g., user CPU time)"] --> B["Report cumulative value since process start"]
    B --> C["RECORD_METRIC_VALUE(metric, cumulative_value)"]
    %% Previously: A --> D["Subtract previous value (delta)"] --> C
    %% Now: direct cumulative reporting

File-Level Changes

Change: Switch OS metrics reporting from delta to cumulative values
Details:
  • Removed subtraction of last recorded values when calling RECORD_METRIC_VALUE
  • Updated RECORD_METRIC_VALUE calls to directly use current usage values for all six metrics
Files: presto_cpp/main/PeriodicTaskManager.cpp

Change: Remove unused state variables for tracking previous metric values
Details:
  • Deleted lastUserCpuTimeUs_, lastSystemCpuTimeUs_, lastSoftPageFaults_, lastHardPageFaults_, lastVoluntaryContextSwitches_, and lastForcedContextSwitches_ members
Files: presto_cpp/main/PeriodicTaskManager.h

Assessment against linked issues

Issue Objective Addressed Explanation
#26516 Change all 6 OS-related AVG type metrics to report cumulative values instead of delta values.
#26516 Ensure consistency of OS AVG type metrics with other AVG metrics in the system (i.e., all report cumulative values).
#26516 Prevent data loss in Prometheus monitoring by reporting cumulative values for OS AVG type metrics.

Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!


@lingbin
Contributor Author

lingbin commented Nov 4, 2025

@majetideepak Could you help review this PR? Thanks.

@lingbin
Contributor Author

lingbin commented Nov 5, 2025

@majetideepak @karteekmurthys @aditi-pandit Gentle ping. Could you please review this PR? The issue affects the accuracy of Prometheus monitoring metrics.

Development

Successfully merging this pull request may close these issues.

[native] OS Metrics(AVG Type Counters) should not report delta values