Feature request: offline CPU handling #873

Open
mjtrangoni opened this issue Mar 29, 2018 · 8 comments · May be fixed by #3032

Comments

@mjtrangoni
Contributor

mjtrangoni commented Mar 29, 2018

Host operating system: output of uname -a

Linux xxxx 3.10.0-693.2.2.el7.ppc64le #1 SMP Sat Sep 9 03:58:38 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux

node_exporter version: output of node_exporter --version

node_exporter, version 0.16.0-rc.0 (branch: build, revision: 8ec35dfcd0aaa05b6039fc3c4bef7a675d419f6b)
  go version:       go1.10

node_exporter command line flags

default

Are you running node_exporter in Docker?

no

What did you do that produced an error?

none

What did you expect to see?

This PPC server has SMT=2 (Simultaneous multithreading) which can scale on-the-fly up to 8x.

# ppc64_cpu --smt
SMT=2
# ppc64_cpu --info
Core   0:    0*    1*    2     3     4     5     6     7
Core   1:    8*    9*   10    11    12    13    14    15
Core   2:   16*   17*   18    19    20    21    22    23
Core   3:   24*   25*   26    27    28    29    30    31
Core   4:   32*   33*   34    35    36    37    38    39
Core   5:   40*   41*   42    43    44    45    46    47
Core   6:   48*   49*   50    51    52    53    54    55
Core   7:   56*   57*   58    59    60    61    62    63
Core   8:   64*   65*   66    67    68    69    70    71
Core   9:   72*   73*   74    75    76    77    78    79
Core  10:   80*   81*   82    83    84    85    86    87
Core  11:   88*   89*   90    91    92    93    94    95
Core  12:   96*   97*   98    99   100   101   102   103
Core  13:  104*  105*  106   107   108   109   110   111
Core  14:  112*  113*  114   115   116   117   118   119
Core  15:  120*  121*  122   123   124   125   126   127
Core  16:  128*  129*  130   131   132   133   134   135
Core  17:  136*  137*  138   139   140   141   142   143
Core  18:  144*  145*  146   147   148   149   150   151
Core  19:  152*  153*  154   155   156   157   158   159

# lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                160
On-line CPU(s) list:   0,1,8,9,16,17,24,25,32,33,40,41,48,49,56,57,64,65,72,73,80,81,88,89,96,97,104,105,112,113,120,121,128,129,136,137,144,145,152,153
Off-line CPU(s) list:  2-7,10-15,18-23,26-31,34-39,42-47,50-55,58-63,66-71,74-79,82-87,90-95,98-103,106-111,114-119,122-127,130-135,138-143,146-151,154-159
Thread(s) per core:    2
Core(s) per socket:    5
Socket(s):             4
NUMA node(s):          4
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8E (raw), altivec supported
L1d cache:             64K
L1i cache:             32K
L2 cache:              512K
L3 cache:              8192K
NUMA node0 CPU(s):     0,1,8,9,16,17,24,25,32,33
NUMA node1 CPU(s):     40,41,48,49,56,57,64,65,72,73
NUMA node16 CPU(s):    80,81,88,89,96,97,104,105,112,113
NUMA node17 CPU(s):    120,121,128,129,136,137,144,145,152,153

# ppc64_cpu --smt=8
# ppc64_cpu --info
Core   0:    0*    1*    2*    3*    4*    5*    6*    7*
Core   1:    8*    9*   10*   11*   12*   13*   14*   15*
Core   2:   16*   17*   18*   19*   20*   21*   22*   23*
Core   3:   24*   25*   26*   27*   28*   29*   30*   31*
Core   4:   32*   33*   34*   35*   36*   37*   38*   39*
Core   5:   40*   41*   42*   43*   44*   45*   46*   47*
Core   6:   48*   49*   50*   51*   52*   53*   54*   55*
Core   7:   56*   57*   58*   59*   60*   61*   62*   63*
Core   8:   64*   65*   66*   67*   68*   69*   70*   71*
Core   9:   72*   73*   74*   75*   76*   77*   78*   79*
Core  10:   80*   81*   82*   83*   84*   85*   86*   87*
Core  11:   88*   89*   90*   91*   92*   93*   94*   95*
Core  12:   96*   97*   98*   99*  100*  101*  102*  103*
Core  13:  104*  105*  106*  107*  108*  109*  110*  111*
Core  14:  112*  113*  114*  115*  116*  117*  118*  119*
Core  15:  120*  121*  122*  123*  124*  125*  126*  127*
Core  16:  128*  129*  130*  131*  132*  133*  134*  135*
Core  17:  136*  137*  138*  139*  140*  141*  142*  143*
Core  18:  144*  145*  146*  147*  148*  149*  150*  151*
Core  19:  152*  153*  154*  155*  156*  157*  158*  159*
# lscpu 
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                160
On-line CPU(s) list:   0-159
Thread(s) per core:    8
Core(s) per socket:    5
Socket(s):             4
NUMA node(s):          4
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8E (raw), altivec supported
L1d cache:             64K
L1i cache:             32K
L2 cache:              512K
L3 cache:              8192K
NUMA node0 CPU(s):     0-39
NUMA node1 CPU(s):     40-79
NUMA node16 CPU(s):    80-119
NUMA node17 CPU(s):    120-159

In the 'SMT=2' case there are 960 metrics we could ignore (4 sockets * 5 cores * 6 (8-2) threads * 8 modes).

# curl -s localhost:9100/metrics | egrep -w -v -e '(HELP|TYPE)' | grep node_cpu_seconds_total | wc -l
1280

My feature request is to reduce the number of CPU metrics. Two alternatives come to mind:

  1. Ignoring the offline CPUs in the node_exporter
  2. Introducing a new label, online="0|1", and filtering during Prometheus scrape process.

What did you want to see instead?

# curl -s localhost:9100/metrics | egrep -w -v -e '(HELP|TYPE)' | grep node_cpu_seconds_total | grep 'online=1 ' | wc -l
320
@SuperQ
Member

SuperQ commented Mar 29, 2018

Interesting, we get CPU metrics from /proc/stat. Is there an online/offline status file somewhere in /proc or /sys we can get this information from? I don't have access to any hardware like this to investigate the options.

@mjtrangoni
Contributor Author

Hi @SuperQ, you can check it like this:

# grep . /sys/devices/system/cpu/cpu*/online|head
/sys/devices/system/cpu/cpu0/online:1
/sys/devices/system/cpu/cpu100/online:0
/sys/devices/system/cpu/cpu101/online:0
/sys/devices/system/cpu/cpu102/online:0
/sys/devices/system/cpu/cpu103/online:0
/sys/devices/system/cpu/cpu104/online:1
/sys/devices/system/cpu/cpu105/online:1
/sys/devices/system/cpu/cpu106/online:0
/sys/devices/system/cpu/cpu107/online:0
/sys/devices/system/cpu/cpu108/online:0

@brian-brazil
Contributor

Ignoring the offline CPUs in the node_exporter
Introducing a new label, online="0|1", and filtering during Prometheus scrape process.

Neither of these makes sense semantically; the information for each CPU needs to be either always there or always not there.
We could expose information about which/how many cpus are online.

@knweiss
Contributor

knweiss commented Mar 29, 2018

Note that, for example, a broken CPU cache can also trigger the offlining of a CPU at runtime.

That is, the ability to count the number of online and offline CPUs would also be useful for alerting in this case. Here's an example (x86_64!):

Mar 24 06:24:42 i Threshold based error status: yellow
Mar 24 06:24:42 i mcelog: Large number of corrected cache errors. System operating, but might lead
Mar 24 06:24:42 i mcelog: to uncorrected errors soon
Mar 24 06:24:42 i mcelog: MCA: Data CACHE Level-1 Data-Read Error
Mar 24 06:24:42 i mcelog: CPU 22 on socket 1 has large number of corrected cache errors in Level-1 Data
Mar 24 06:24:42 i mcelog: System operating correctly, but might lead to uncorrected cache errors soon
Mar 24 06:24:42 i mcelog: Running trigger `cache-error-trigger'
Mar 24 06:24:42 i mcelog: STATUS 8c40004000100135 MCGSTATUS 0
Mar 24 06:24:42 i mcelog: MCGCAP f000c14 APICID 38 SOCKETID 1
Mar 24 06:24:42 i mcelog: PPIN 8c20004000101151
Mar 24 06:24:42 i mcelog: CPUID Vendor Intel Family 6 Model 85
Mar 24 06:24:42 i mcelog: Offlining CPU 22 due to cache error threshold
Mar 24 06:24:42 i kernel: intel_pstate CPU 22 exiting
Mar 24 06:24:42 i kernel: smpboot: CPU 22 is now offline
Mar 24 06:24:42 i mcelog: Offlining CPU 46 due to cache error threshold

On CentOS the script that offlines a CPU in this case can be found here:

# cat /etc/mcelog/triggers/cache-error-trigger
#!/bin/bash
# cache error trigger. This shell script is executed by mcelog in daemon mode
# when a CPU reports excessive corrected cache errors. This could be a indication
# for future uncorrected errors.
[...]
for i in $AFFECTED_CPUS ; do
        logger -s -p daemon.crit -t mcelog "Offlining CPU $i due to cache error threshold"
        F=$(printf "/sys/devices/system/cpu/cpu%d/online" $i)
        echo 0 > $F
[...]
done
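To sketch the alerting use case: assuming a hypothetical per-CPU gauge `node_cpu_online` (1 = online, 0 = offline), a Prometheus alerting rule could look roughly like this. The metric name and rule are illustrative, not an existing exporter feature.

```yaml
groups:
  - name: cpu-offline
    rules:
      - alert: CPUOffline
        # node_cpu_online is a hypothetical gauge: 1 = online, 0 = offline.
        expr: count by (instance) (node_cpu_online == 0) > 0
        for: 5m
        annotations:
          summary: "{{ $value }} CPU(s) offline on {{ $labels.instance }}"
```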

@SuperQ
Member

SuperQ commented Mar 29, 2018

Yes, I think a separate bool metric is the right thing to do here.

node_cpu_online{cpu="x"}

Looking at some of my systems, none of them have an online indicator. This seems to be somewhat platform dependent.

What do we want to do if the CPU is offline? Should we stop exposing /proc/stat counters for these CPUs? It seems reasonable to me, if we have the new bool metric.

@brian-brazil
Contributor

What do we want to do if the CPU is offline? Should we stop exposing /proc/stat counters for these CPUs? It seems reasonable to me, if we have the new bool metric.

That can cause problems with rates; they should stay exposed. Constant time series are cheap to store anyway.

@mjtrangoni
Contributor Author

Hi @SuperQ @brian-brazil ,

I did some research on this topic and read the Linux documentation here. See:

Additionally, CPU topology information is provided under
/sys/devices/system/cpu and includes these files.  The internal
source for the output is in brackets ("[]").

    =========== ==========================================================
    kernel_max: the maximum CPU index allowed by the kernel configuration.
		[NR_CPUS-1]

    offline:	CPUs that are not online because they have been
		HOTPLUGGED off (see cpu-hotplug.txt) or exceed the limit
		of CPUs allowed by the kernel configuration (kernel_max
		above). [~cpu_online_mask + cpus >= NR_CPUS]

    online:	CPUs that are online and being scheduled [cpu_online_mask]

    possible:	CPUs that have been allocated resources and can be
		brought online if they are present. [cpu_possible_mask]

    present:	CPUs that have been identified as being present in the
		system. [cpu_present_mask]
=========== ==========================================================

I think we could iterate over /sys/devices/system/cpu/present, mark as 1 all CPUs that also appear in /sys/devices/system/cpu/online, and expose node_cpu_online{cpu="x"}="0|1".

You can see this in more detail here.

  1. Not all hardware exposes cpu0.
  2. offline and possible are a bit tricky; see AMD EPYC and Intel Skylake. They are not always present, and they reflect the architecture's maximum number of CPUs.
  3. This has only been checked on CentOS 6.9 and CentOS 7.4.

Another approach, aimed at keeping the metric count down, would be something like this:

node_cpu_online_count=XX
node_cpu_present_count=XX

I will open a PR for this soon if you agree.
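For reference, a self-contained sketch (a hypothetical helper, not the eventual PR) of parsing the kernel's CPU-list format used by the present and online files, and deriving the proposed per-CPU gauge from it:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUList expands the kernel's CPU list format, e.g. "0-3,8,10-11",
// as found in /sys/devices/system/cpu/online and .../present.
func parseCPUList(s string) ([]int, error) {
	var cpus []int
	s = strings.TrimSpace(s)
	if s == "" {
		return cpus, nil
	}
	for _, part := range strings.Split(s, ",") {
		lo, hi, isRange := strings.Cut(part, "-")
		start, err := strconv.Atoi(lo)
		if err != nil {
			return nil, err
		}
		end := start
		if isRange {
			if end, err = strconv.Atoi(hi); err != nil {
				return nil, err
			}
		}
		for i := start; i <= end; i++ {
			cpus = append(cpus, i)
		}
	}
	return cpus, nil
}

func main() {
	// Illustrative inputs; on a real system these would be read from sysfs.
	present, _ := parseCPUList("0-9")
	online, _ := parseCPUList("0,1,8,9")

	onlineSet := make(map[int]bool)
	for _, c := range online {
		onlineSet[c] = true
	}
	// Emit the proposed gauge: 1 for online CPUs, 0 for present-but-offline.
	for _, c := range present {
		v := 0
		if onlineSet[c] {
			v = 1
		}
		fmt.Printf("node_cpu_online{cpu=%q} %d\n", strconv.Itoa(c), v)
	}
}
```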

@discordianfish
Member

Agree that a separate node_cpu_online metric makes sense, and that we should keep exposing the stale values for offline CPUs, as @brian-brazil suggested.

rexagod added a commit to rexagod/procfs that referenced this issue May 30, 2024
@rexagod rexagod linked a pull request May 30, 2024 that will close this issue
SuperQ pushed a commit to prometheus/procfs that referenced this issue Jun 3, 2024