cpu: Implement steal/guest/guest_nice counters #2
Be aware that the user time counter already includes guest time. If you use the guest counter in the same graph, you should subtract the guest value from the user value as well. |
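A minimal sketch of that subtraction, assuming the field order of the aggregate `cpu` line in `/proc/stat`. The function name `user_nice_without_guest` and the sample numbers are made up for illustration; treating `nice` and `guest_nice` the same way as `user` and `guest` is an assumption based on the kernel accounting code linked later in this thread.

```shell
# Hypothetical helper: prints "user nice" with guest time removed.
# Input: one aggregate "cpu ..." line as found in /proc/stat.
user_nice_without_guest() {
    # Word splitting on the unquoted argument yields:
    # cpu user nice system idle iowait irq softirq steal guest guest_nice
    set -- $1
    user=$2 nice=$3
    guest=${10:-0} guest_nice=${11:-0}   # default to 0 if the fields are absent
    echo "$((user - guest)) $((nice - guest_nice))"
}

# Usage with made-up sample numbers (user=1000, guest=400, nice=50, guest_nice=10):
user_nice_without_guest "cpu  1000 50 300 9000 20 0 5 7 400 10"
```

With these sample values the helper prints `600 40`.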
@kvisle Afaik the real So for the graph it should be fine. (Example graph from Debian Stretch with munin-plugins-core.) Last graph from Muninlite with my PR. But you're right, this could eventually be changed to draw a better graph. For the moment it's just to fill it up to 100%. |
@kvisle: thank you for raising this point. But I failed to find such a connection in @killerbees19: your change proposal looks good to me. Indeed, it is very useful when running OpenWrt in a VM. I will merge it once we have finished the cpu/guest discussion. |
That relationship is probably best documented in the source code: https://github.com/torvalds/linux/blob/master/kernel/sched/cputime.c#L139-L156 Regarding 767100, I did some experiments with looking at these counters under different kinds of load - and this is counter-behavior. This is more visible under higher polling resolutions, and under higher load. |
Sorry for letting this rot for so long ... I am about to publish another release of
@kvisle: after taking a look at the example graphs provided by @killerbees19, I fail to understand this connection: the graphs look fine, everything adds up to a multiple of 100%. Maybe I am missing something obvious. Nevertheless, @killerbees19 seems to agree with @kvisle's point:
I am confused - your help is appreciated :) Given my partial understanding, I see two approaches:
Please enlighten me! |
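One way to check the point under discussion is to compare two samples of the aggregate `cpu` line: the sum of the first eight deltas versus the sum of all ten. This is only a sketch; the helper name `sum_fields` and both sample lines are invented for illustration, and it assumes guest time is already contained in the user counter, as stated earlier in this thread.

```shell
# Hypothetical helper: sum counters $2..$3 (1-based, not counting the
# "cpu" label) of the line passed in $1.
sum_fields() {
    line=$1 first=$2 last=$3
    set -- $line
    shift                       # drop the "cpu" label
    i=1 sum=0
    for v in "$@"; do
        if [ "$i" -ge "$first" ] && [ "$i" -le "$last" ]; then
            sum=$((sum + v))
        fi
        i=$((i + 1))
    done
    echo "$sum"
}

# Two made-up samples; only user, system, idle, steal and guest change.
before="cpu  1000 50 300 9000 20 0 5 7 400 10"
after="cpu  1060 50 310 9120 20 0 5 8 450 10"

# Fields 1-8 (user..steal) versus all ten fields stacked.
elapsed=$(( $(sum_fields "$after" 1 8) - $(sum_fields "$before" 1 8) ))
stacked=$(( $(sum_fields "$after" 1 10) - $(sum_fields "$before" 1 10) ))
echo "elapsed=$elapsed stacked=$stacked overshoot=$((stacked - elapsed))"
```

For these samples the overshoot equals the guest delta (450 - 400 = 50), which is the amount a stacked graph of all ten raw fields would count twice under that assumption.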
@killerbees19: ping? |
Oops, sorry! I missed this one. I'll have a look at it again, but currently I'm a little bit busy. I'm not 100% sure what's the correct answer. |
@killerbees19: ping? |
Thanks @kimheino for bringing this to my attention again. I'm sorry for the long delay! 🥺
Test script
Command (originally it's a one-liner) to watch
t=-2; last=
while true; do
    t=$((t + 1))
    # Reprint the column header every 10 samples.
    if [ $((t % 10)) -eq 0 ]; then
        echo
        echo "   us   ni   sy   id   wa   hi   si   st   gu   gn"
    fi
    # Aggregate "cpu" line without its 5-character "cpu  " label.
    current=$(grep '^cpu ' /proc/stat | cut -c6-)
    if [ -n "$last" ]; then
        # Print the delta of each of the ten counters since the last sample.
        for i in $(seq 1 10); do
            a=$(echo "$last" | cut -d ' ' -f "$i")
            b=$(echo "$current" | cut -d ' ' -f "$i")
            printf '%5d' "$((b - a))"
        done
        echo
    fi
    last=$current
    sleep 1
done
Test case
Ok, here's a dual core KVM host with low load. After 6 lines I've fired up
The last 6 lines are showing values with stopped workload at the guest.
Conclusion
I still agree with @kvisle in the point, that
We could ignore this fact and just go on. But strictly speaking the graphs are wrong! Our current drawing situation at CPU plugin(s):
I think something like this would be better:
I've no clue if this is even possible or how to change the config output to achieve this. Any ideas?
Alternate option
Just merge the PR and think about it later 😝 There's little to no difference to have perfect graphs. RAW values from all fields are valid (as someone would expect it from
It's up to you.
Remark
If we'd recalculate |
Any progress on this PR? |
It's not my turn anymore, right? |
@killerbees19 can you rebase on current master and squash the commits (or give maintainers the ability to update this pull request)? Thanks. |
steal (since Linux 2.6.11)
(8) Stolen time, which is the time spent in other operating systems when running in a virtualized environment
guest (since Linux 2.6.24)
(9) Time spent running a virtual CPU for guest operating systems under the control of the Linux kernel.
guest_nice (since Linux 2.6.33)
(10) Time spent running a niced guest (virtual CPU for guest operating systems under the control of the Linux kernel).
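The column numbers quoted above map directly onto positional words of the `cpu` line. A small sketch of reading them with a fallback for kernels that predate a field follows; the helper name `extra_cpu_counters` and the sample lines are hypothetical, and this is not necessarily how the plugin in this PR does it.

```shell
# Hypothetical helper: print "steal guest guest_nice" from an aggregate
# "cpu ..." line, using 0 for any counter the line does not contain.
extra_cpu_counters() {
    # After word splitting, $1 is the "cpu" label, so columns (8), (9)
    # and (10) from the man page are words 9, 10 and 11.
    set -- $1
    echo "${9:-0} ${10:-0} ${11:-0}"
}

# Made-up line with all ten counters:
extra_cpu_counters "cpu  1000 50 300 9000 20 0 5 7 400 10"
# Made-up line with only seven counters, as an older kernel would print:
extra_cpu_counters "cpu  1000 50 300 9000 20 0 5"
```

On a live system the same helper could be fed with `"$(grep '^cpu ' /proc/stat)"`. Defaulting to 0 keeps the output shape stable regardless of kernel version.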
eadf11e to 5f55dcf
@kenyon Done |
@killerbees19: Good job, thanks! (Never too late) |
I think it's time to implement these counters after so many years! ;-)
I've tried to maintain backwards compatibility. This should not break on old kernel versions.
Indeed, it's only useful on VM guests or hypervisors…
Successfully tested on OpenWrt 19.07.2 (x86_64 @ KVM)