Problems in estimating multithreaded virtualisation processes #335

Open
malinvaudpaul opened this issue Jul 13, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@malinvaudpaul

Hello, I am using Scaphandre (v0.5.0) to estimate the power consumption of VMs. I'm wondering about the accuracy of the data returned for a multithreaded QEMU process.

[Screenshot, 2023-07-12 18:42:08: graph of the host power, QEMU process and stress-ng metrics described below]

Here is the setup: the host runs kernel 5.15.50 and a multithreaded QEMU virtualisation process (yellow line). The green line is the metric scaph_host_power_microwatts. I run the command stress-ng -c 1 (blue line) in three periods: first on the host only, then simultaneously on the host and in the VM, and finally in the VM only.
You would expect the graph to be symmetric around the 16:33:30 point. However, the same stress load, applied on the host or in the VM, does not seem to be evaluated the same way by Scaphandre at the level of the QEMU process, whereas at host level it is estimated to be of roughly the same order of magnitude.

Note that during the last period, the qemu-kvm process accounted for about the same share of the sum of all processes as the stress-ng-cpu process did during the first period, something like 98%.
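For anyone wanting to reproduce that share from raw exporter output, here is a minimal sketch. It assumes the per-process metric is named scaph_process_power_consumption_microwatts and carries an exe label; both should be checked against the output of the Scaphandre version in use. The sample values are made up for illustration and are not taken from the graph above.

```python
import re
from collections import defaultdict
from typing import Dict

# Assumption: name of the per-process power metric in the Prometheus output.
PROCESS_METRIC = "scaph_process_power_consumption_microwatts"

# Matches one sample line of the Prometheus text format: name{labels} value
SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Hypothetical scrape; values are illustrative only.
SAMPLE = """\
# HELP scaph_host_power_microwatts (illustrative sample)
scaph_host_power_microwatts 30000000
scaph_process_power_consumption_microwatts{exe="qemu-kvm",pid="4242"} 9800000
scaph_process_power_consumption_microwatts{exe="sshd",pid="900"} 150000
scaph_process_power_consumption_microwatts{exe="scaphandre",pid="901"} 50000
"""


def process_shares(exposition: str, metric: str = PROCESS_METRIC) -> Dict[str, float]:
    """Return each executable's fraction of the summed per-process metric."""
    totals: Dict[str, float] = defaultdict(float)
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None or match.group("name") != metric:
            continue  # not a sample of the metric we are interested in
        labels = dict(LABEL.findall(match.group("labels") or ""))
        totals[labels.get("exe", "unknown")] += float(match.group("value"))
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {exe: value / grand_total for exe, value in totals.items()}


if __name__ == "__main__":
    for exe, share in sorted(process_shares(SAMPLE).items()):
        print(f"{exe}: {share:.1%}")
```

Summing by exe rather than by pid keeps the comparison between the stress-ng-cpu and qemu-kvm periods on the same footing even if several PIDs share an executable name.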

I therefore have the impression that Scaphandre has a blind spot when it comes to estimating multithreaded processes, which I can't yet explain.
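A possible way to narrow this down is to compare the kernel's own CPU accounting for the QEMU process with the sum over its threads. The sketch below relies only on Linux procfs: /proc/<pid>/stat reports utime and stime accumulated over the whole thread group, while /proc/<pid>/task/<tid>/stat reports them per thread. How Scaphandre v0.5.0 reads these counters is not stated in this thread, so this only checks the kernel side. The function names are hypothetical helpers, not part of Scaphandre.

```python
import os
from typing import Dict, Tuple


def parse_cpu_ticks(stat_line: str) -> Tuple[int, int]:
    """Return (utime, stime) in clock ticks from one procfs stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the last ')' instead of on whitespace.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so utime (field 14) is rest[11]
    # and stime (field 15) is rest[12].
    return int(rest[11]), int(rest[12])


def thread_group_ticks(pid: int) -> int:
    """utime + stime for the whole thread group, from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        return sum(parse_cpu_ticks(f.read()))


def per_thread_ticks(pid: int) -> Dict[int, int]:
    """utime + stime for each live thread, from /proc/<pid>/task/<tid>/stat."""
    ticks: Dict[int, int] = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                ticks[int(tid)] = sum(parse_cpu_ticks(f.read()))
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited between listdir() and open()
    return ticks


if __name__ == "__main__":
    # Illustrative stat line, truncated after the fields used here.
    sample = "4242 (qemu-kvm) S 1 4242 4242 0 -1 4194560 100 0 0 0 500 250 0 0 20 0 4 0"
    print(parse_cpu_ticks(sample))  # (500, 250)
    if os.path.isdir("/proc/self/task"):
        pid = os.getpid()  # replace with the QEMU PID to inspect the VM process
        print(thread_group_ticks(pid), sum(per_thread_ticks(pid).values()))
```

The two totals can differ slightly because the thread-group figure also includes threads that have already exited and because the reads are not atomic. A large gap between a tool's per-process CPU share and the thread-group total would point towards per-thread accounting as the cause.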

@malinvaudpaul malinvaudpaul added the bug Something isn't working label Jul 13, 2023
@bpetit bpetit added this to General Jun 19, 2024
@bpetit bpetit moved this to Triage in General Jun 19, 2024
@bpetit
Contributor

bpetit commented Oct 17, 2024

Hi, this behavior looks like something that led to other issues in the past and has been fixed in 1.0. Could you upgrade and see how it goes?

To clarify, this documentation page explains the global behavior of the process* and host metrics.

I'd be keen to read about further tests. Thank you.
