November 2019
tl;dr: Fuse aleatoric and epistemic uncertainty estimation into a Bayesian framework.
Very statistical paper with lots of math details. The implementation uses the TensorFlow Probability (tfp) package. It reinterprets the DL object detector as one measurement device with variance.
The anchor boxes are different measurement devices.
Marginalization over the anchor distribution seems to predict epistemic uncertainty.
The open-set detection problem (detecting unknown unknowns) is still an area that safety-critical applications such as autonomous driving need to focus on.
- The baseline: loss attenuation
$$L(x) = \frac{1}{2\sigma(x)^2} |y - f(x)| + \frac{1}{2} \log \sigma(x)^2$$ - Note that the uncertainty is predicted per anchor (in the original object detector, each anchor predicts a bbox; again, each anchor is essentially a sliding window).
- This acts as a learned robust regression loss: samples with high predicted variance are down-weighted, making it more robust against outliers (noise).
- The second term serves as a regularizer, keeping the network from predicting arbitrarily large variance everywhere.
- Bayesian inference is used to replace NMS (see the fusion sketch at the end of these notes).
- Aleatoric uncertainty estimation also includes a term for false positives (negative anchors).
$$L = \frac{1}{N_{pos}}\sum_{i=1}^{N_{pos}} L(x_i) + \frac{1}{N_{neg}}\sum_{i=1}^{N_{neg}} \frac{1}{\sigma(x_i)^2}$$ - The second term is claimed by the author to provide better numerical stability, but this did not transfer well to TF2.0. A minimal sketch of the combined loss follows below.
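Under my reading, a minimal TF2 sketch of the two loss terms above, assuming a per-anchor scalar variance predicted as log σ² (a common trick for numerical stability). Function names, tensor shapes, and the pos/neg masking scheme are my own, not taken from the paper:

```python
import tensorflow as tf

def attenuated_l1_loss(y_true, y_pred, log_var):
    """Per-anchor loss attenuation: 1/(2 sigma^2) * |y - f(x)| + 1/2 * log sigma^2.

    Predicting log(sigma^2) avoids dividing by a near-zero variance.
    Shapes: y_true, y_pred are [num_anchors, 4]; log_var is [num_anchors].
    """
    inv_var = tf.exp(-log_var)                            # 1 / sigma^2
    l1 = tf.reduce_sum(tf.abs(y_true - y_pred), axis=-1)  # per-anchor L1 residual
    return 0.5 * inv_var * l1 + 0.5 * log_var

def total_regression_loss(y_true, y_pred, log_var, pos_mask, neg_mask):
    """Positive anchors get the attenuated L1 term; negative anchors only get
    the 1/sigma^2 penalty. Each term is normalized by its own anchor count."""
    n_pos = tf.maximum(tf.reduce_sum(pos_mask), 1.0)
    n_neg = tf.maximum(tf.reduce_sum(neg_mask), 1.0)
    pos_term = tf.reduce_sum(pos_mask * attenuated_l1_loss(y_true, y_pred, log_var)) / n_pos
    neg_term = tf.reduce_sum(neg_mask * tf.exp(-log_var)) / n_neg
    return pos_term + neg_term
```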
- The black-box (MC dropout) method has little discriminative power between TP and FP, mainly because it observes the detections only after NMS.
- Sampling-free (aleatoric regression) methods are faster than black-box ones (one forward pass instead of multiple; see the MC dropout sketch at the end of these notes for the contrast). The authors say they provide worse categorical discrimination, but Fig. 3 seems to contradict this --> why?
- The authors used GMUE instead of total variance (the trace of the covariance matrix). --> why?
- The number of object instances in the scene is unknown a priori --> maybe this is one area where DL is better than deterministic optimization algorithms such as MUSIC?
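For the NMS-replacement idea above, a minimal sketch of how I read the measurement-device view: each box in a cluster of overlapping detections is treated as an independent Gaussian measurement of the same object, so the fused box is the precision-weighted mean with a pooled covariance. The function name and the use of full covariance matrices are my assumptions; the actual procedure presumably also folds in the anchor prior, which I omit here:

```python
import numpy as np

def fuse_cluster(means, covs):
    """Fuse a cluster of overlapping box estimates as independent Gaussian
    measurements: precision-weighted mean and pooled covariance.

    means: (k, 4) array of box parameters (one row per clustered detection)
    covs:  (k, 4, 4) array of per-detection covariance matrices
    """
    precisions = np.linalg.inv(covs)                       # (k, 4, 4)
    fused_cov = np.linalg.inv(precisions.sum(axis=0))      # pooled covariance
    fused_mean = fused_cov @ np.einsum('kij,kj->i', precisions, means)
    return fused_mean, fused_cov
```

The inverse-variance weighting means confident cluster members dominate the fused box, and the fused covariance shrinks as more consistent measurements are added, which matches the measurement-device interpretation above.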
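For contrast, the black-box MC dropout baseline, sketched assuming a Keras model with dropout layers (the function name and sample count are mine). This makes clear why sampling-based methods need multiple forward passes while the sampling-free aleatoric head needs only one:

```python
import tensorflow as tf

def mc_dropout_predict(model, x, num_samples=20):
    """Run several stochastic forward passes with dropout left on
    (training=True) and use the spread of the outputs as epistemic uncertainty."""
    samples = tf.stack([model(x, training=True) for _ in range(num_samples)], axis=0)
    mean = tf.reduce_mean(samples, axis=0)
    var = tf.math.reduce_variance(samples, axis=0)  # per-output epistemic variance
    return mean, var
```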