|
2 | 2 |
|
3 | 3 | ## Overview |
4 | 4 |
|
5 | | -Reports CPU utilization percentages for all available time categories (user, system, idle, nice, iowait, irq, softirq, steal, guest, guest_nice) plus the overall cpu-usage (100 − idle − nice). |
| 5 | +Monitors system-wide CPU utilization and uses sustained-load detection to avoid false alerts from short-lived spikes. Reports percentages for all standard CPU time categories (user, system, idle, nice, iowait, irq, softirq, steal, guest, guest_nice) plus the calculated overall `cpu-usage` (100 - idle - nice). |
6 | 6 |
|
7 | | -Thresholds (WARN/CRIT) are checked against user, system, iowait, and cpu-usage. An alert is raised only if the threshold is exceeded for COUNT consecutive runs, suppressing short spikes and focusing on sustained load. |
| 7 | +**Alerting Logic:** |
8 | 8 |
|
9 | | -Perfdata is emitted for every field to enable full graphing. Extended stats (context switches, interrupts, etc.) are included if supported on this platform. With `--top`, the most CPU-intensive processes are also listed for quick diagnosis. |
| 9 | +* Thresholds apply to: `user`, `system`, `iowait`, and overall `cpu-usage` |
| 10 | +* An alert triggers only when a threshold is exceeded for `--count` consecutive check runs (default: 5) |
| 11 | +* Example: With default settings and a 1-minute check interval, a WARN/CRIT state requires sustained high CPU for 5 consecutive minutes |
| 12 | +* Single brief spikes are ignored, so alerts reflect persistent performance issues |
10 | 13 |
|
11 | | -This check is cross-platform and works on Linux, Windows, and all psutil-supported systems. |
| 14 | +**Data Collection:** |
12 | 15 |
|
13 | | -Hints and Recommendations: |
| 16 | +* System-wide aggregate CPU statistics (not per-core) |
| 17 | +* Non-blocking measurement: state is persisted in SQLite between runs |
| 18 | +* Platform-specific extended metrics where available (context switches, interrupts, soft interrupts) |
| 19 | +* Optional top-N CPU-consuming processes (`--top`, default: 5) |
14 | 20 |
|
15 | | -* We check system-wide CPU stats, not per-CPU. |
16 | | -* `--count=5` (the default) while checking every minute means that the check reports a warning if any of `user`, `system`, `iowait` or overall `cpu-usage` was above a threshold in the last 5 minutes. |
| 21 | +**Compatibility:** |
| 22 | + |
| 23 | +* Cross-platform: Linux, Windows, and all psutil-supported systems |
| 24 | +* Uses an SQLite database (`$TEMP/linuxfabrik-monitoring-plugins-cpu-usage.db`) for trend tracking |
| 25 | +* Full perfdata output for graphing all metrics in Nagios/Icinga |
17 | 26 |
|
18 | 27 |
|
19 | 28 | ## Fact Sheet |
|
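The alerting logic described in the Overview above (an alert only after `--count` consecutive runs over a threshold, with overall usage computed as 100 - idle - nice) can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the function names, the in-memory `history` list (standing in for the SQLite state), the example threshold of 80, and the strict `>` comparison are all assumptions. It also assumes that a single field must stay above the threshold for the whole window, which is one possible reading of the text.

```python
# Hypothetical sketch; names and data layout are illustrative assumptions.
CHECKED_FIELDS = ("user", "system", "iowait", "cpu-usage")


def overall_cpu_usage(idle, nice):
    """Overall usage as defined in the Overview: 100 - idle - nice."""
    return 100.0 - idle - nice


def sustained_breach(history, threshold, count=5):
    """Return True if any checked field exceeded `threshold`
    in each of the last `count` runs.

    `history` is a list of per-run dicts (oldest first), standing in
    for whatever the plugin persists between runs.
    """
    if len(history) < count:
        # Not enough runs recorded yet, so no sustained load can be claimed.
        return False
    recent = history[-count:]
    return any(
        all(run[field] > threshold for run in recent)
        for field in CHECKED_FIELDS
    )


def make_run(user, system=5.0, iowait=1.0, idle=50.0, nice=0.0):
    """Build one hypothetical measurement record."""
    return {
        "user": user,
        "system": system,
        "iowait": iowait,
        "cpu-usage": overall_cpu_usage(idle, nice),
    }


# Five consecutive runs with high `user` time: a sustained breach.
sustained = [make_run(90.0) for _ in range(5)]
# Same, but the third run dips below the threshold: treated as a spike.
spiky = [make_run(90.0), make_run(90.0), make_run(20.0), make_run(90.0), make_run(90.0)]

print(sustained_breach(sustained, threshold=80.0))
print(sustained_breach(spiky, threshold=80.0))
```

With a 1-minute check interval and `count=5`, the first history corresponds to five minutes of sustained load, while the second is reset by the single low run and raises no alert.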