Merge 'Address non-monotonicity of steal time and other issues' from Travis Downs
This series primarily addresses the problem that on systems with low amounts of steal, steal time appears negative (i.e., the cumulative steal time counter goes down from sample to sample).
This is wrong on the face of it and also causes serious problems for the metric in Prometheus, since the counter contract (monotonic increase) is violated. The result is spurious "counter reset" detection in Prometheus and hence bogus very large or very small steal time results in `rate` (or similar) queries.
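To illustrate why a tiny decrease produces a huge result, here is a simplified model of how a counter increase is computed when any drop is treated as a reset. This is a sketch only: `increase_with_resets` is a hypothetical name, and the model ignores the extrapolation that real `rate` queries apply at window boundaries.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model (an assumption, not Prometheus source code):
// a sample lower than its predecessor is treated as a counter that
// restarted from zero, so the whole current value counts as increase.
std::int64_t increase_with_resets(const std::vector<std::int64_t>& samples) {
    std::int64_t total = 0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        if (samples[i] >= samples[i - 1]) {
            // Normal case: add the difference between consecutive samples.
            total += samples[i] - samples[i - 1];
        } else {
            // Apparent reset: assume the counter climbed from 0 to samples[i].
            total += samples[i];
        }
    }
    return total;
}
```

With cumulative steal samples of 10000, 9999 and 10001 ms, the true change over the window is 1 ms, but the single 1 ms dip makes this model report 10001 ms.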
This is addressed in two ways:
- We make the sleep time calculation more accurate. Inaccurate sleep time accounting is the underlying reason for negative steal, and this change reduces the error (and so the "negativeness") of steal by a couple of orders of magnitude. After this change steal time is often 0 when rounded to the nearest ms where it wasn't before.
- Because the reduction above still does not completely prevent small negative steal, we change the implementation of the metric to essentially clamp steal at 0 in periods where it was negative.
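The clamping described in the second point can be sketched roughly as follows. This is a minimal illustration, not the code in this series: `monotonic_steal` and `update` are hypothetical names, and it assumes the raw cumulative steal value is sampled periodically and may occasionally decrease.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper, not Seastar's actual implementation.
// Accumulates only the non-negative per-period change of a raw
// cumulative steal value, so the reported total never decreases.
class monotonic_steal {
    std::chrono::nanoseconds _last_raw{0};  // raw value seen at the previous sample
    std::chrono::nanoseconds _total{0};     // monotonic value exposed as the metric
public:
    std::chrono::nanoseconds update(std::chrono::nanoseconds raw) {
        auto delta = raw - _last_raw;
        _last_raw = raw;
        // A period with negative steal contributes 0 rather than a decrease.
        _total += std::max(delta, std::chrono::nanoseconds::zero());
        return _total;
    }
};
```

For raw samples of 5 ns, 3 ns and 10 ns the reported totals are 5 ns, 5 ns and 12 ns. One consequence of clamping per period is that the reported total can drift slightly above the raw value, since negative periods are dropped rather than carried forward.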
The individual changes have further details. I am open to splitting commits that may be less popular or require more discussion into a different PR if it makes sense.
Closes #2390
* https://github.com/scylladb/seastar:
Make total_steal_time() monotonic.
Remove account_idle
reactor: add better sleep time accounting
reactor: add cpu and awake time reactor metrics
Zero-init total sleep time
-sm::description("Total steal time, the time in which some other process was running while Seastar was not trying to run (not sleeping)."
+sm::description("Total steal time, the time in which something else was running while the reactor was runnable (not sleeping)."
 "Because this is in userspace, some time that could be legitimally thought as steal time is not accounted as such. For example, if we are sleeping and can wake up but the kernel hasn't woken us up yet.")),