|
2 | 2 |
|
3 | 3 | ## Warning
|
4 | 4 |
|
5 |
| -Spikekill is designed to overwrite specific data points in one or more RRD files. You should use great caution when running Spikekill, and you should always ensure you have proper backups from which to restore your RRD files if Spikekill behaves in a way you did not expect. |
| 5 | +Spikekill is designed to overwrite specific data points in one or more RRD |
| 6 | +files. You should use great caution when running Spikekill, and you should |
| 7 | +always ensure you have proper backups from which to restore your RRD files if |
| 8 | +Spikekill performs a modification that you did not expect. |
6 | 9 |
|
7 | 10 | ## Overview
|
8 | 11 |
|
9 |
| -Spikekill is a tool used to remove spikes in a graph. Spikes can appear in a graph after a device reboots, or when you switch from 32-bit to 64-bit interface counters on a device. Spikekill works by statistically analyzing the data contained inside an RRD file, and overwriting specific data points. It offers 4 methods by which to analyze the data and selectively overwrite data points: |
10 |
| -1. Standard Deviation - Calculate the [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) of the data, and overwrite data points that are _N_ times higher than the standard deviation |
11 |
| -1. Variance Average - Calculate the average value of the data, and overwrite data points that are _N_ percent higher than the average |
12 |
| -1. GapFill - Find gaps in the data (missing data points) and fill them in with the average value |
13 |
| -1. Float - Calculate the average value of the data, and then overwrite all data points that are within a specified time range |
| 12 | +Spikekill is a tool used to remove spikes in a graph. Spikes can appear in a |
| 13 | +graph after a device reboots, or when you switch from 32-bit to 64-bit interface |
| 14 | +counters on a device. Spikekill works by statistically analyzing the data |
| 15 | +contained inside an RRD file, and overwriting specific data points. It offers |
| 16 | +four methods by which to analyze the data and selectively overwrite data points: |
| 17 | + |
| 18 | +1. _Standard Deviation_ |
| 19 | + |
| 20 | + Calculates the [standard |
| 21 | + deviation](https://en.wikipedia.org/wiki/Standard_deviation) of the data, and |
| 22 | + overwrites data points that are _N_ times higher than the standard deviation. |
| 23 | + |
| 24 | +2. _Variance Average_ |
| 25 | + |
| 26 | + Calculates the average value of the data, and overwrites data points that are |
| 27 | + _N_ percent higher than the average. |
| 28 | + |
| 29 | +3. _GapFill_ |
| 30 | + |
| 31 | + Finds gaps in the data (missing data points) and also finds data points that |
| 32 | + are _N_ percent higher than the average, and overwrites them, but only does |
| 33 | + so inside the specified time range. |
| 34 | + |
| 35 | +4. _Float_ |
| 36 | + |
| 37 | + Overwrites all data points that are within a specified time range. |
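
The two statistical methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Spikekill's actual code: the function names are hypothetical, the standard deviation check is interpreted here as "more than _N_ standard deviations above the average", and the outlier trimming follows the _Variance Outliers_ description (drop the highest and lowest _N_ values before averaging).

```python
from statistics import mean, pstdev


def stddev_spikes(values, n_stddevs):
    """Return indexes of points above average + n_stddevs * standard deviation.

    Assumption: the threshold is measured from the average of the data.
    """
    avg = mean(values)
    sd = pstdev(values)
    return [i for i, v in enumerate(values) if v > avg + n_stddevs * sd]


def variance_spikes(values, percent, outliers):
    """Return indexes of points more than `percent` percent above the average.

    The highest and lowest `outliers` values are ignored when computing
    the average, as described for the Variance Outliers setting.
    """
    trimmed = sorted(values)
    if outliers > 0 and len(trimmed) > 2 * outliers:
        trimmed = trimmed[outliers:-outliers]
    limit = mean(trimmed) * (1 + percent / 100)
    return [i for i, v in enumerate(values) if v > limit]


# Hypothetical samples: a steady counter with one spike at index 4.
samples = [10, 12, 11, 13, 500, 12, 11]
print(stddev_spikes(samples, 2))
print(variance_spikes(samples, 100, 1))
```

In this sample both functions flag only the spike at index 4. Note that the untrimmed average (about 81) is heavily skewed by the spike itself, which is why ignoring outliers before averaging can make the Variance Average threshold more useful.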
14 | 38 |
|
15 | 39 | ## How to use
|
16 |
| -Spikekill is easily run on a graph with a couple mouse clicks: |
17 |
| -1. On Cacti's main `Graph` tab, click the Spikekill icon  next to a graph. |
18 |
| -1. In the drop-down menu that appears, review your current settings by hovering on `Settings` and reviewing each item, making changes if desired. |
19 |
| -1. Run Spikekill by choosing one of the four methods available in the drop-down menu. Spikekill runs immediately, and the graph may be modified, depending on your settings. The graph is refreshed when Spikekill is finished. |
| 40 | + |
| 41 | +### GUI |
| 42 | + |
| 43 | +Spikekill is easily run on a graph with a couple of mouse clicks in the Cacti GUI: |
| 44 | + |
| 45 | +1. On Cacti's main `Graph` tab, click the Spikekill icon |
| 46 | +  next to a graph. |
| 47 | + |
| 48 | +2. In the drop-down menu that appears, review your current settings by hovering |
| 49 | + on `Settings` and reviewing each item, making changes if desired. |
| 50 | + |
| 51 | +3. Run Spikekill by choosing one of the four methods available in the drop-down |
| 52 | + menu. Spikekill runs immediately, and the graph may be modified, depending on |
| 53 | + your settings. The graph is refreshed when Spikekill is finished. |
| 54 | + |
| 55 | +### CLI |
| 56 | + |
| 57 | +Spikekill also has a flexible command line interface. The following example |
| 58 | +will help you get started. |
| 59 | + |
| 60 | +```console |
| 61 | +shell>php cli/removespikes.php --help |
| 62 | +``` |
20 | 63 |
|
21 | 64 | ## Settings
|
22 |
| -Spikekill requires certain values to successfully calculate which data points to overwrite. The following settings are customizable: |
23 |
| -1. Replacement Method - When Spikekill identifies data points to overwrite, the value of _Replacement Method_ represents the data that will be written in place of each data point |
24 |
| -1. Standard Deviations - When using the Standard Deviation method, the value of _Standard Deviations_ is the coefficient which determines how many times above the standard deviation the data must be to be considered a spike |
25 |
| -1. Variance Percentage - When using the Variance Average method, the value of _Variance Percentage_ is the coefficient which determines how much higher than the average the data must be to be considered a spike |
26 |
| -1. Variance Outliers - When using the Variance Average method, the highest _N_ values and the lowest _N_ values are considered _outliers_. These outliers are ignored when calculating the average of the data |
27 |
| -1. Kills Per RRA - Spikekill will limit the number of data points that it overwrites to the value of _Kills Per RRA_. For reference, a single RRD file can contain multiple _data sources_ and multiple _archives_. Therefore, the total number of data points that could be overwritten in a single RRD file will be determined by the equation: ```[number_of_RRAs] * [number_of_data_sources] * [Kills_Per_RRA]``` |
28 |
| - |
29 |
| -The above settings are maintained individually for each Cacti user. The default values can be changed globally in the Cacti settings, found at Configuration > Settings > Spikes. A user can change his or her individual settings by selecting different choices in the Spikekill menu which appears next to a graph. |
| 65 | + |
| 66 | +Spikekill requires certain values to successfully calculate which data points to |
| 67 | +overwrite. The following settings are customizable: |
| 68 | + |
| 69 | +1. _Replacement Method_ |
| 70 | + |
| 71 | + When Spikekill identifies data points to overwrite, the |
| 72 | + value of _Replacement Method_ represents the data that will be written in |
| 73 | + place of each data point. |
| 74 | + |
| 75 | + When _Replacement Method_ is set to `NaN`, running Spikekill using `GapFill` |
| 76 | + or `Float` methods will perform no modifications. Please be aware of [Issue |
| 77 | + #3673](https://github.com/Cacti/cacti/issues/3673), where 'Last Known Good' |
| 78 | + is always used, regardless of setting. |
| 79 | + |
| 80 | +2. _Standard Deviations_ |
| 81 | + |
| 82 | + When using the Standard Deviation method, the value of _Standard Deviations_ |
| 83 | + is the coefficient which determines how many times above the standard |
| 84 | + deviation the data must be to be considered a spike. |
| 85 | + |
| 86 | +3. _Variance Percentage_ |
| 87 | + |
| 88 | + When using the Variance Average method, the value of _Variance Percentage_ is |
| 89 | + the coefficient which determines how much higher than the average the data |
| 90 | + must be to be considered a spike. |
| 91 | + |
| 92 | +4. _Variance Outliers_ |
| 93 | + |
| 94 | + When using the Variance Average method, the highest _N_ values and the lowest |
| 95 | + _N_ values are considered _outliers_. These outliers are ignored when |
| 96 | + calculating the average of the data. |
| 97 | + |
| 98 | +5. _Kills Per RRA_ |
| 99 | + |
| 100 | + Spikekill will limit the number of data points that it |
| 101 | + overwrites to the value of _Kills Per RRA_. For reference, a single RRD file |
| 102 | + can contain multiple _data sources_ and multiple _archives_. Therefore, the |
| 103 | + total number of data points that could be overwritten in a single RRD file |
| 104 | + will be determined by the formula: |
| 105 | + |
| 106 | + ```console |
| 107 | + (number_of_RRAs) x (number_of_data_sources) x (Kills_Per_RRA) |
| 108 | + ``` |
| 109 | + |
| 110 | + This only applies when using the `Standard Deviation` or `Variance Average` |
| 111 | + methods, and does not apply when using the `GapFill` or `Float` methods. |
| 112 | + |
| 113 | +The above settings are maintained individually for each Cacti user. The |
| 114 | +default values can be changed globally in the Cacti settings, found at |
| 115 | +Configuration > Settings > Spikes. A user can change her or his individual |
| 116 | +settings by selecting different choices in the Spikekill menu which appears |
| 117 | +next to a graph. |
| 118 | + |
| 119 | +## Detailed Operation |
| 120 | + |
| 121 | +### Independence between each RRA:DS pair |
| 122 | + |
| 123 | +When Spikekill runs, it analyzes each data source (DS) of each round-robin |
| 124 | +archive (RRA) independently. During analysis, it calculates values for |
| 125 | +`average` and `standard deviation` for each RRA:DS pair. |
| 126 | + |
| 127 | +To understand this better, consider the following example. |
| 128 | + |
| 129 | +- You have a round-robin database (RRD) file that stores traffic levels on a |
| 130 | + router interface. |
| 131 | + |
| 132 | +- You have two DSes, which are `traffic_in` and `traffic_out`. |
| 133 | + |
| 134 | +- You also have selected three profiles, which are 1 minute for 7 days, 15 |
| 135 | + minutes for 5 weeks, and 1 hour for 3 years. |
| 136 | + |
| 137 | +- Lastly, you have three consolidation functions (CF), which are `average`, |
| 138 | + `min`, and `max`. |
| 139 | + |
| 140 | +In this example, you have 9 RRAs (3 profiles multiplied by 3 CFs). Therefore, |
| 141 | +Spikekill will calculate 18 averages and 18 standard deviation values (9 RRAs |
| 142 | +multiplied by 2 DSes). Data points for each RRA:DS pair will be compared |
| 143 | +against their respective `average` and `standard deviation` values. |
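
The counting in this example can be checked with a short Python sketch. The profile labels below are made-up shorthand for the three profiles in the example; only the counts matter.

```python
from itertools import product

# Values taken from the example above; the profile labels are illustrative.
profiles = ["1min/7d", "15min/5w", "1h/3y"]
consolidation_functions = ["average", "min", "max"]
data_sources = ["traffic_in", "traffic_out"]

# One RRA per (profile, CF) combination.
rras = list(product(profiles, consolidation_functions))

# One average and one standard deviation per (RRA, DS) pair.
pairs = list(product(rras, data_sources))

print(len(rras))   # number of RRAs
print(len(pairs))  # number of RRA:DS pairs
```

This prints 9 and 18, matching the counts given in the example.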
| 144 | + |
| 145 | +### Search order |
| 146 | + |
| 147 | +Spikekill searches for spikes in the same order as `rrddump` exports data, |
| 148 | +which is sequentially by RRA, from oldest to newest data points within each |
| 149 | +RRA. Data points are overwritten if they exceed the maximum allowed threshold |
| 150 | +or if they do not reach the minimum required threshold. |
| 151 | + |
| 152 | +Spikekill will cease overwriting data points for any individual RRA when it has |
| 153 | +reached the value of _Kills Per RRA_, counting all DSes of that RRA together. |
| 154 | +It will then begin the process again on the next RRA. |
| 155 | + |
| 156 | +This means that for any given RRA, older spikes will always be overwritten |
| 157 | +before newer spikes. However, newer spikes in one RRA can still be overwritten |
| 158 | +before older spikes in a different RRA, since the thresholds are different per |
| 159 | +RRA:DS pair. |
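
The search order described above can be modelled with a small Python sketch. It is a simplified illustration, not the real implementation: the data layout, the `kill_spikes` name, and the threshold table are all hypothetical, and only the maximum threshold is checked. Each RRA is walked from oldest to newest row, and the kill counter is shared by all DSes of that RRA.

```python
def kill_spikes(rras, max_allowed, kills_per_rra, replacement=float("nan")):
    """Overwrite spikes in place and return the (rra, row_index, ds) kills.

    rras:        {rra_name: [row, ...]}, rows ordered oldest to newest,
                 each row a {ds_name: value} dict (hypothetical layout).
    max_allowed: {(rra_name, ds_name): threshold}, one per RRA:DS pair.
    """
    killed = []
    for rra_name, rows in rras.items():
        kills = 0  # counter is reset for every RRA
        for index, row in enumerate(rows):
            for ds in list(row):
                if kills >= kills_per_rra:
                    break  # limit reached: leave the rest of this RRA alone
                if row[ds] > max_allowed[(rra_name, ds)]:
                    row[ds] = replacement
                    killed.append((rra_name, index, ds))
                    kills += 1
    return killed


# Hypothetical data: two RRAs, two DSes, every threshold set to 100.
rras = {
    "rra0": [{"in": 5, "out": 900}, {"in": 800, "out": 6}, {"in": 700, "out": 5}],
    "rra1": [{"in": 5, "out": 5}, {"in": 999, "out": 4}],
}
max_allowed = {(r, ds): 100 for r in rras for ds in ("in", "out")}

killed = kill_spikes(rras, max_allowed, kills_per_rra=2)
print(killed)
```

With a limit of 2, the two oldest spikes in `rra0` are overwritten and the newest one (700) is left in place, while the spike in `rra1` is still handled because the counter starts over for each RRA.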
30 | 160 |
|
31 | 161 | ---
|
32 | 162 | Copyright (c) 2004-2020 The Cacti Group
|