
Gigantic scores of DB12 recorded by DIRAC Pilots (from 50 up to 2250) #7760

Open
IgorPelevanyuk opened this issue Aug 22, 2024 · 10 comments

@IgorPelevanyuk
Contributor

Hello,

Recently I noticed unrealistically large values of the DB12 benchmark. Normally it should be somewhere between 0 and 45, but during this summer we had pilots with jobs that recorded values up to 2250.

So far it has been observed only on one computing resource, running AlmaLinux with Intel Gold 6242R CPUs.
Here is a zoomed-in view from our local monitoring. Every dot is a job, the color is the CPU model, X is the DB12 value, Y is the job walltime.
[image: zoomed monitoring plot, one dot per job]
I have checked the databases:

mysql> use JobDB;
mysql> select JobID, CONVERT(Value USING utf8) from JobParameters where Name="CPUNormalizationFactor" and CONVERT(Value USING utf8)>45;
+---------+---------------------------+
| JobID   | CONVERT(Value USING utf8) |
+---------+---------------------------+
| 4298295 | 60.7                      |
| 4298419 | 45.8                      |
| 4298478 | 48.8                      |
| 4298494 | 48.8                      |
+---------+---------------------------+
4 rows in set (0.48 sec)

Unfortunately, the jobs with the craziest DB12 values have already been cleaned from DIRAC, and only recent jobs with wrong DB12 values are left. But I was able to grep the DIRAC Pilot logs on the computing resource and found this:

2024-06-28T08:38:56.134271Z INFO [ConfigureCPURequirements] Command ConfigureCPURequirements instantiated from pilotCommands
2024-06-28T08:38:56.135707Z INFO [ConfigureCPURequirements] Executing command dirac-wms-cpu-normalization -U  -R pilot.cfg --cfg pilot.cfg -d
Computing benchmark measurements on 1 processor(s)...
Applying a correction on the CPU power: 1.0
Estimated CPU power is 2272.7 HS06
Configuration file dumped 'pilot.cfg'

2024-06-28T08:39:17.640561Z INFO [ConfigureCPURequirements] Current normalized CPU as determined by 'dirac-wms-cpu-normalization' is 2272.700000

Has anybody seen this before?

@IgorPelevanyuk
Contributor Author

I also have to note that the worker nodes are virtualized with KVM.
I did some investigation of this problem. The benchmark runs from the CVMFS version of DIRAC, and sometimes, when CVMFS is doing something as a process on the worker node, the benchmark processes start to consume more than 100% of CPU in htop.
I created a simple script to run 40 dirac-wms-cpu-normalization processes in the background (a minimal sketch of such a launcher is shown after the screenshots). This script was executed several times in a row. The first screenshot shows the previous normalization results (left) and htop monitoring of the current script run.
[Screenshot from 2024-08-23 14-31-18: previous normalization results (left) and htop during the script run]

The second screenshot shows the results of the 40 dirac-wms-cpu-normalization runs collected after the test.
[Screenshot from 2024-08-23 14-31-24: results of the 40 dirac-wms-cpu-normalization runs]

As you can see, the normal benchmark value is within the 35-37 range. When "something" happens, it may grow up to 55, and even higher during the longer tests performed by real DIRAC pilots.
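For reference, a minimal Python re-creation of such a launcher (my original script is not shown here; the command line is taken from the pilot log above, while the process count and the per-process config files are assumptions):

import subprocess

# Launch N copies of dirac-wms-cpu-normalization in the background to oversubscribe
# the cores, then collect their outputs. Giving each copy its own config file is an
# assumption, made here to avoid the copies overwriting a shared pilot.cfg.
N = 40

procs = [
    subprocess.Popen(
        ["dirac-wms-cpu-normalization", "-U", "-R", f"pilot_{i}.cfg", "--cfg", f"pilot_{i}.cfg", "-d"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    for i in range(N)
]

for i, p in enumerate(procs):
    out, _ = p.communicate()
    print(f"--- process {i} ---")
    print(out.decode())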

@IgorPelevanyuk
Contributor Author

I also created a test version of the DB12 benchmark to investigate. So far it looks like the os.times() function reports incorrect values:

The norm is calculated as

NORM = 250.0 / CPU

(consistent with the listings below, where NORM * CPU is always about 250).
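For reference, a minimal sketch of this kind of measurement, assuming the 250.0 scaling constant implied by the listings below; the workload loop is only a placeholder, not the real DIRACbenchmark kernel:

import os
import time

def db12_like():
    wall_start = time.time()
    t_start = os.times()  # (user, system, children_user, children_system, elapsed)

    # Placeholder workload: the real DB12 runs a fixed amount of floating-point work.
    x = 0.0
    for i in range(50_000_000):
        x += i * 1e-9

    t_end = os.times()
    cpu = sum(t_end[:4]) - sum(t_start[:4])   # u + s + cu + cs
    wall = time.time() - wall_start

    return {"CPU": round(cpu, 2), "WALL": round(wall, 2), "NORM": round(250.0 / cpu, 2)}

print(db12_like())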

To record the times, the os.times() function is used. Start values are recorded at the beginning of the benchmark workload iterations, and end values are recorded after the benchmark workload. In the following listing, CPU represents exactly that: (end u+s+cu+cs) - (start u+s+cu+cs). I was able to record really strange values:

7 CPU: 9.65     WALL: 12.94     NORM: 25.91     ST: u: 7.14 s: 0.01 cu:0.0 cs:0.0 E: u: 16.79 s: 0.01 cu:0.0 cs:0.0
10 CPU: 9.19    WALL: 13.21     NORM: 27.20     ST: u: 7.18 s: 0.01 cu:0.0 cs:0.0 E: u: 16.37 s: 0.01 cu:0.0 cs:0.0
20 CPU: 7.13    WALL: 13.63     NORM: 35.06     ST: u: 7.11 s: 0.01 cu:0.0 cs:0.0 E: u: 14.1 s: 0.15 cu:0.0 cs:0.0
 4 CPU: 6.66    WALL: 12.87     NORM: 37.54     ST: u: 7.26 s: 0.01 cu:0.0 cs:0.0 E: u: 13.92 s: 0.01 cu:0.0 cs:0.0
 3 CPU: 8.71    WALL: 13.29     NORM: 28.70     ST: u: 7.12 s: 0.0 cu:0.0 cs:0.0 E: u: 15.83 s: 0.0 cu:0.0 cs:0.0
 1 CPU: 6.79    WALL: 13.04     NORM: 36.82     ST: u: 7.22 s: 0.01 cu:0.0 cs:0.0 E: u: 14.01 s: 0.01 cu:0.0 cs:0.0
31 CPU: 4.75    WALL: 13.82     NORM: 52.63     ST: u: 7.15 s: 0.0 cu:0.0 cs:0.0 E: u: 11.9 s: 0.0 cu:0.0 cs:0.0
17 CPU: 8.74    WALL: 12.91     NORM: 28.60     ST: u: 6.63 s: 0.0 cu:0.0 cs:0.0 E: u: 15.37 s: 0.0 cu:0.0 cs:0.0
19 CPU: 6.92    WALL: 13.51     NORM: 36.13     ST: u: 7.51 s: 0.01 cu:0.0 cs:0.0 E: u: 14.42 s: 0.02 cu:0.0 cs:0.0
39 CPU: 3.92    WALL: 13.40     NORM: 63.78     ST: u: 7.24 s: 0.01 cu:0.0 cs:0.0 E: u: 11.16 s: 0.01 cu:0.0 cs:0.0
22 CPU: 4.09    WALL: 14.45     NORM: 61.12     ST: u: 7.14 s: 0.0 cu:0.0 cs:0.0 E: u: 11.23 s: 0.0 cu:0.0 cs:0.0
24 CPU: 7.44    WALL: 12.40     NORM: 33.60     ST: u: 7.08 s: 0.49 cu:0.0 cs:0.0 E: u: 14.51 s: 0.5 cu:0.0 cs:0.0
23 CPU: 2.45    WALL: 12.56     NORM: 102.04    ST: u: 7.31 s: 0.01 cu:0.0 cs:0.0 E: u: 9.76 s: 0.01 cu:0.0 cs:0.0
35 CPU: 2.93    WALL: 12.25     NORM: 85.32     ST: u: 7.56 s: 0.03 cu:0.0 cs:0.0 E: u: 10.49 s: 0.03 cu:0.0 cs:0.0
32 CPU: 12.03   WALL: 12.41     NORM: 20.78     ST: u: 7.38 s: 0.01 cu:0.0 cs:0.0 E: u: 19.41 s: 0.01 cu:0.0 cs:0.0
21 CPU: 6.72    WALL: 12.76     NORM: 37.20     ST: u: 6.89 s: 0.0 cu:0.0 cs:0.0 E: u: 13.61 s: 0.0 cu:0.0 cs:0.0
 5 CPU: 2.70    WALL: 13.76     NORM: 92.59     ST: u: 7.52 s: 0.01 cu:0.0 cs:0.0 E: u: 10.22 s: 0.01 cu:0.0 cs:0.0
26 CPU: 6.89    WALL: 12.13     NORM: 36.28     ST: u: 7.36 s: 0.01 cu:0.0 cs:0.0 E: u: 14.25 s: 0.01 cu:0.0 cs:0.0
12 CPU: 2.45    WALL: 15.09     NORM: 102.04    ST: u: 6.73 s: 0.01 cu:0.0 cs:0.0 E: u: 9.18 s: 0.01 cu:0.0 cs:0.0
38 CPU: 1.69    WALL: 12.61     NORM: 147.93    ST: u: 7.18 s: 0.0 cu:0.0 cs:0.0 E: u: 8.87 s: 0.0 cu:0.0 cs:0.0
11 CPU: 6.73    WALL: 12.72     NORM: 37.15     ST: u: 7.53 s: 0.0 cu:0.0 cs:0.0 E: u: 14.25 s: 0.01 cu:0.0 cs:0.0
33 CPU: 6.84    WALL: 12.10     NORM: 36.55     ST: u: 6.38 s: 0.0 cu:0.0 cs:0.0 E: u: 13.22 s: 0.0 cu:0.0 cs:0.0
36 CPU: 6.74    WALL: 12.25     NORM: 37.09     ST: u: 7.42 s: 0.01 cu:0.0 cs:0.0 E: u: 14.16 s: 0.01 cu:0.0 cs:0.0
37 CPU: 6.66    WALL: 12.16     NORM: 37.54     ST: u: 7.44 s: 0.01 cu:0.0 cs:0.0 E: u: 14.1 s: 0.01 cu:0.0 cs:0.0
 6 CPU: 6.53    WALL: 12.13     NORM: 38.28     ST: u: 7.37 s: 0.0 cu:0.0 cs:0.0 E: u: 13.9 s: 0.0 cu:0.0 cs:0.0
30 CPU: 6.62    WALL: 11.91     NORM: 37.76     ST: u: 6.87 s: 0.01 cu:0.0 cs:0.0 E: u: 13.49 s: 0.01 cu:0.0 cs:0.0
34 CPU: 12.15   WALL: 13.94     NORM: 20.58     ST: u: 6.9 s: 0.0 cu:0.0 cs:0.0 E: u: 19.04 s: 0.01 cu:0.0 cs:0.0
25 CPU: 6.76    WALL: 12.28     NORM: 36.98     ST: u: 6.85 s: 0.0 cu:0.0 cs:0.0 E: u: 13.61 s: 0.0 cu:0.0 cs:0.0
13 CPU: 3.49    WALL: 6.64      NORM: 71.63     ST: u: 12.65 s: 0.01 cu:0.0 cs:0.0 E: u: 16.14 s: 0.01 cu:0.0 cs:0.0
 9 CPU: 10.24   WALL: 17.11     NORM: 24.41     ST: u: 7.19 s: 0.01 cu:0.0 cs:0.0 E: u: 17.43 s: 0.01 cu:0.0 cs:0.0
40 CPU: 12.53   WALL: 16.63     NORM: 19.95     ST: u: 7.47 s: 0.0 cu:0.0 cs:0.0 E: u: 19.99 s: 0.01 cu:0.0 cs:0.0
28 CPU: 3.93    WALL: 6.94      NORM: 63.61     ST: u: 10.16 s: 0.38 cu:0.0 cs:0.0 E: u: 14.09 s: 0.38 cu:0.0 cs:0.0
18 CPU: 4.12    WALL: 6.76      NORM: 60.68     ST: u: 6.86 s: 0.0 cu:0.0 cs:0.0 E: u: 10.98 s: 0.0 cu:0.0 cs:0.0
15 CPU: 2.18    WALL: 7.31      NORM: 114.68    ST: u: 7.37 s: 0.0 cu:0.0 cs:0.0 E: u: 9.55 s: 0.0 cu:0.0 cs:0.0
16 CPU: 9.85    WALL: 17.35     NORM: 25.38     ST: u: 7.2 s: 0.01 cu:0.0 cs:0.0 E: u: 17.05 s: 0.01 cu:0.0 cs:0.0
 8 CPU: 5.07    WALL: 6.71      NORM: 49.31     ST: u: 9.48 s: 0.01 cu:0.0 cs:0.0 E: u: 14.55 s: 0.01 cu:0.0 cs:0.0
14 CPU: 5.66    WALL: 8.05      NORM: 44.17     ST: u: 6.68 s: 0.01 cu:0.0 cs:0.0 E: u: 12.34 s: 0.01 cu:0.0 cs:0.0
 2 CPU: 6.11    WALL: 6.82      NORM: 40.92     ST: u: 7.66 s: 0.01 cu:0.0 cs:0.0 E: u: 13.77 s: 0.01 cu:0.0 cs:0.0
29 CPU: 6.15    WALL: 6.37      NORM: 40.65     ST: u: 12.18 s: 0.01 cu:0.0 cs:0.0 E: u: 18.33 s: 0.01 cu:0.0 cs:0.0
27 CPU: 5.55    WALL: 6.31      NORM: 45.05     ST: u: 11.84 s: 0.01 cu:0.0 cs:0.0 E: u: 17.39 s: 0.01 cu:0.0 cs:0.0

This is the listing when the estimations are done sequentially:

  1 CPU: 6.29    WALL: 6.31      NORM: 39.75     ST: u: 6.24 s: 0.0 cu:0.0 cs:0.0 E: u: 12.53 s: 0.0 cu:0.0 cs:0.0
 2 CPU: 6.29    WALL: 6.31      NORM: 39.75     ST: u: 6.26 s: 0.0 cu:0.0 cs:0.0 E: u: 12.55 s: 0.0 cu:0.0 cs:0.0
 3 CPU: 6.42    WALL: 6.44      NORM: 38.94     ST: u: 6.21 s: 0.0 cu:0.0 cs:0.0 E: u: 12.63 s: 0.0 cu:0.0 cs:0.0
 4 CPU: 6.21    WALL: 6.24      NORM: 40.26     ST: u: 6.2 s: 0.0 cu:0.0 cs:0.0 E: u: 12.41 s: 0.0 cu:0.0 cs:0.0
 5 CPU: 6.21    WALL: 6.23      NORM: 40.26     ST: u: 6.18 s: 0.0 cu:0.0 cs:0.0 E: u: 12.39 s: 0.0 cu:0.0 cs:0.0
 6 CPU: 6.23    WALL: 6.25      NORM: 40.13     ST: u: 6.43 s: 0.0 cu:0.0 cs:0.0 E: u: 12.66 s: 0.0 cu:0.0 cs:0.0
 7 CPU: 6.24    WALL: 6.27      NORM: 40.06     ST: u: 6.25 s: 0.0 cu:0.0 cs:0.0 E: u: 12.49 s: 0.0 cu:0.0 cs:0.0
 8 CPU: 6.29    WALL: 6.31      NORM: 39.75     ST: u: 6.3 s: 0.0 cu:0.0 cs:0.0 E: u: 12.59 s: 0.0 cu:0.0 cs:0.0
 9 CPU: 6.24    WALL: 6.26      NORM: 40.06     ST: u: 6.2 s: 0.0 cu:0.0 cs:0.0 E: u: 12.44 s: 0.0 cu:0.0 cs:0.0
10 CPU: 6.23    WALL: 6.25      NORM: 40.13     ST: u: 6.2 s: 0.0 cu:0.0 cs:0.0 E: u: 12.43 s: 0.0 cu:0.0 cs:0.0
11 CPU: 6.24    WALL: 6.27      NORM: 40.06     ST: u: 6.26 s: 0.0 cu:0.0 cs:0.0 E: u: 12.5 s: 0.0 cu:0.0 cs:0.0
12 CPU: 6.25    WALL: 6.26      NORM: 40.00     ST: u: 6.22 s: 0.0 cu:0.0 cs:0.0 E: u: 12.47 s: 0.0 cu:0.0 cs:0.0

Normally CPU is around 6.25, but for the biggest benchmark values from the first listing the CPU value can be around 2. This would mean that the benchmark workload was executed three times faster than it should have been, which is impossible.
I believe this situation occurs only when there are more processes than cores. As shown in the screenshot from my previous post, the CVMFS process started to run alongside the benchmark tests.

This is the first time I have observed this within a DIRAC environment, and it happens only on one particular resource. It is counter-intuitive, because benchmark results are used to calculate normalized CPU time, yet it looks like if more than one job runs on one core on some resource, the benchmark can come out higher!

I will try to run more tests on another server which is neither KVM-virtualized nor running AlmaLinux.

BTW: DIRAC version v8.0.36

@fstagni
Contributor

fstagni commented Aug 23, 2024

Thank you @IgorPelevanyuk for the notification and the extensive tests. At a minimum, they show that there are some fixes to do. Please report back your findings and your suggestions.

EDIT: we do not have much development workforce at the moment, so you are of course welcome to take care of what will need to be done.

@IgorPelevanyuk
Contributor Author

Hello, @fstagni!

So far, I think the main culprit of the issue is ProxmoxVE on Debian. No similar issues were found on the same cluster with Libvirt on Fedora. We will try to run more tests (in two weeks) to understand why Proxmox behaves so strangely.

It is hard to say anything about a solution yet. The issue appears only in a very specific virtual environment. I hope it can be fixed by some setting in Proxmox.

@fstagni
Contributor

fstagni commented Sep 13, 2024

Just for info, are you also using cgroups on this cluster?

@IgorPelevanyuk
Contributor Author

Slurm on this particular resource is configured to use cgroups. But other Slurm resources use them too.

@IgorPelevanyuk
Contributor Author

After several other tests the issue still persists.
The only thing I can propose is to move away from os.times(), which in this particular case is inaccurate.

Right now, an advantage (or not?) of DIRACBenchmark is that you can run it even on a 100% loaded cluster, and because the benchmark does not rely on walltime for its estimation, you still get a rather accurate result.

Example: on a server that is free of load you may get the DB12 result in 30 seconds. Then you start some processes and overload the server so that 100% of all cores are used, and after that you start the DB12 benchmark again. In that case it takes longer for the benchmark to run, because from time to time it is interrupted by other processes on the server, so the benchmark lasts, say, 60 seconds. But in the end you get the same DB12 result. (In this example I am neglecting the fact that the clock frequency of an idle processor is usually much higher.)

If we would like to use walltime in Python, there is the time.perf_counter() function. In that case we would get benchmark results more closely related to the real speed of the server, but it would not be possible to get a more or less accurate result if you want to benchmark a server that is already 100% loaded. A minimal sketch of both timing variants is shown below.
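For illustration, a sketch of the two timing variants side by side, reusing the same assumptions as in the earlier sketch (the 250.0 constant inferred from the listings, a placeholder workload); NORM_WT is just an illustrative name for the wall-clock-based value:

import os
import time

def db12_cpu_and_wall():
    cpu_start = sum(os.times()[:4])     # u + s + cu + cs
    wall_start = time.perf_counter()

    # Placeholder workload standing in for the real benchmark kernel.
    x = 0.0
    for i in range(50_000_000):
        x += i * 1e-9

    cpu = sum(os.times()[:4]) - cpu_start
    wall = time.perf_counter() - wall_start

    return {
        "NORM": 250.0 / cpu,       # current CPU-time-based result
        "NORM_WT": 250.0 / wall,   # wall-clock-based alternative discussed here
    }

print(db12_cpu_and_wall())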

Honestly, I do not fully understand why using cputime is better for benchmarking compared with walltime. The only advantage is the possibility to run the benchmark in parallel with other processes on the same CPU core.

I think that in our DIRAC installation we may try to tweak the DIRACBenchmark script in CVMFS to use walltime and see how it works.

What could be done for DIRAC in general is to write to JobParametersDB not only DB12 calculated with cputime, but also DB12 calculated with walltime, calling it DB12WT or something like this. That would make it possible to do some extensive analysis on different DIRAC installations and computing resources, and after that decide whether the move from one approach to the other is justified.

@fstagni
Contributor

fstagni commented Nov 18, 2024

Hi Igor, as discussed in Thursday's meeting: we can insert an alternate CPU power in the Job parameters, please start with this commit: 2de8586

On the DB12 side, you can define a new method or a new flag; I leave that up to you.

@IgorPelevanyuk
Contributor Author

@fstagni, Thank you!

Do you propose putting this code in the DIRAC master branch, or is it only a hint for me to change our local DIRAC installation directory in CVMFS?

@fstagni
Contributor

fstagni commented Nov 28, 2024

You can add it to the master branch.
