
Threshold based metrics

Thomas Nipen edited this page Jan 26, 2018 · 13 revisions

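Threshold-based metrics evaluate forecasts based on their ability to predict the exceedance or non-exceedance of a threshold. For example, for a threshold of 20 mm, each pair of observation and forecast falls into one of four categories: hit (both the forecast and the observation exceed the threshold), false alarm (only the forecast exceeds it), miss (only the observation exceeds it), or correct rejection (neither exceeds it).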

The thresholding creates a contingency table with values of a, b, c, and d. In general, better forecasts have more hits (a) and correct rejections (d) and fewer false alarms (b) and misses (c).

Numerous metrics exist that use the values of a, b, c, and d. Commonly used ones are the threat score (-m threat), the equitable threat score (-m ets), proportion correct (-m pc), and the symmetric extreme dependency score (-m seds), but many more are supported in Verif. We will look at the threat score, which rewards hits and penalizes false alarms and misses; correct rejections (d) do not enter the calculation. It is given by the equation a / (a + b + c).
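To make the definitions concrete, here is a minimal Python sketch that builds the contingency table and the threat score for one threshold. It is an illustration only, not Verif's implementation; the function names and the sample values are made up for this example, and events are defined as X > threshold.

```python
def contingency(obs, fcst, threshold):
    """Count hits (a), false alarms (b), misses (c), correct rejections (d)."""
    a = b = c = d = 0
    for o, f in zip(obs, fcst):
        obs_event = o > threshold    # event observed
        fcst_event = f > threshold   # event forecast
        if fcst_event and obs_event:
            a += 1                   # hit
        elif fcst_event:
            b += 1                   # false alarm
        elif obs_event:
            c += 1                   # miss
        else:
            d += 1                   # correct rejection
    return a, b, c, d


def threat_score(a, b, c):
    """a / (a + b + c); undefined (NaN) when the denominator is 0."""
    denom = a + b + c
    return a / denom if denom > 0 else float("nan")


# Hypothetical precipitation values in mm
obs = [25.0, 5.0, 30.0, 0.0, 22.0, 10.0]
fcst = [28.0, 21.0, 12.0, 1.0, 24.0, 8.0]

a, b, c, d = contingency(obs, fcst, 20.0)
score = threat_score(a, b, c)
print(a, b, c, d, score)  # 2 1 1 2 0.5
```

Note that d is counted but never used by the threat score, so adding more correct rejections to the sample would leave the score unchanged.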

verif raw.nc cal.nc -m threat

This shows the threat score as the threshold is varied. Notice that the default x-axis is threshold, which is always the case for metrics that use the contingency table. The thresholds used in the figure are automatically selected, but can be specified by using the -r flag:

verif raw.nc cal.nc -m threat -r 0:30

If a different axis is specified, then verif shows the average threat score across all thresholds, for example:

verif raw.nc cal.nc -m threat -r 0:30 -x leadtime

The same is true if the score is shown on a map. Note that if any of the thresholds yield undefined values (for example, if the denominator in the threat score calculation is 0), then the average will also be undefined.
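The effect of an undefined value on the average can be sketched in Python as follows. This assumes that undefined scores are represented as NaN and that the average is a plain arithmetic mean; both are assumptions for illustration, and mean_score is a hypothetical helper, not a Verif function.

```python
import math


def mean_score(scores):
    """Average per-threshold scores; undefined if any score is undefined."""
    if any(math.isnan(s) for s in scores):
        return math.nan
    return sum(scores) / len(scores)


# Hypothetical threat scores at three thresholds
print(mean_score([0.5, 0.25, 0.75]))      # 0.5
print(mean_score([0.5, math.nan, 0.75]))  # nan
```

In practice this means that a range of thresholds extending beyond the observed and forecast values can make the whole average undefined.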

Interval types

For these scores, the default is to define an event as exceeding a threshold (X > threshold). The -b option changes how events are defined. Using -b below means that an event occurs if the value is below the threshold (X < threshold), effectively interchanging hit with correct rejection and false alarm with miss in the contingency table. Other options are -b below= (X <= threshold) and -b above= (X >= threshold). The default is -b above.
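The four single-threshold interval types can be summarized with a small Python sketch. The dictionary and the is_event helper are hypothetical names used only for illustration; the comparisons follow the definitions above, with a trailing = making the comparison inclusive.

```python
import operator

# Interval type (as passed to -b) mapped to the comparison it implies
COMPARATORS = {
    "above": operator.gt,   # X >  threshold (default)
    "above=": operator.ge,  # X >= threshold
    "below": operator.lt,   # X <  threshold
    "below=": operator.le,  # X <= threshold
}


def is_event(x, threshold, interval="above"):
    """Return True if x counts as an event for the given interval type."""
    return COMPARATORS[interval](x, threshold)


# A value exactly on the threshold is an event only for the inclusive types
for name in COMPARATORS:
    print(name, is_event(20.0, 20.0, name))
```

The difference between the strict and inclusive types matters mostly for variables with many values exactly at a threshold, such as precipitation at 0 mm.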

For metrics that have multiple thresholds, -b within forces Verif to consider the region between each consecutive pair of thresholds as an event. Notice that the points on the graphs are now plotted in the middle of each bin, instead of at each threshold. -b =within means (lower <= X < upper); -b within= and -b =within= are also defined.

verif raw.nc cal.nc -m threat -r 0:30 -b within
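The binning can be sketched in Python as follows. Only =within (lower <= X < upper) is spelled out above; this sketch assumes that a leading = makes the lower edge inclusive and a trailing = makes the upper edge inclusive, so that within is strict on both sides. That reading, the helper names, and the thresholds are assumptions for illustration, not taken from Verif's code.

```python
def bins(thresholds):
    """Pair consecutive thresholds into (lower, upper) bins."""
    return list(zip(thresholds[:-1], thresholds[1:]))


def centers(thresholds):
    """Midpoint of each bin, where the points would be plotted."""
    return [(lower + upper) / 2 for lower, upper in bins(thresholds)]


def in_bin(x, lower, upper, interval="within"):
    """Assumed semantics: '=' on a side makes that edge inclusive."""
    lower_ok = x >= lower if interval.startswith("=") else x > lower
    upper_ok = x <= upper if interval.endswith("=") else x < upper
    return lower_ok and upper_ok


thresholds = [0, 10, 20, 30]           # hypothetical thresholds
print(bins(thresholds))                # [(0, 10), (10, 20), (20, 30)]
print(centers(thresholds))             # [5.0, 15.0, 25.0]
print(in_bin(10, 10, 20, "=within"))   # True
print(in_bin(10, 10, 20, "within"))    # False
```

With N thresholds there are N - 1 bins, so the plot has one fewer point than the corresponding single-threshold plot.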