Modern deep learning models typically provide no estimate of the uncertainty of their predictions, for instance in classification or regression tasks. These uncertainties can be estimated via Bayesian deep learning, yielding powerful tools that capture both aleatoric and epistemic uncertainties:
- aleatoric uncertainty: noise inherent in the data, which cannot be reduced by increasing the dataset size;
- epistemic uncertainty: uncertainty in the model parameters, which can therefore be reduced given enough data.
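As a minimal sketch of how the two kinds of uncertainty can be separated in practice (the function name and shapes here are illustrative, not the repository's API): given `T` stochastic forward passes, the law of total variance splits the predictive variance into the average predicted noise (aleatoric) and the spread of the predicted means (epistemic).

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    means, variances: arrays of shape (T, N) from T stochastic forward
    passes (e.g. Monte Carlo samples of the network weights) over N inputs.
    """
    aleatoric = variances.mean(axis=0)  # average predicted data noise
    epistemic = means.var(axis=0)       # disagreement between weight samples
    total = aleatoric + epistemic       # law of total variance
    return aleatoric, epistemic, total
```

If all weight samples agree on the mean, the epistemic term vanishes and only the data noise remains, matching the definitions above.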
The method presented here was first developed in arXiv:1703.04977 and translated to particle physics applications in arXiv:2003.11099 and arXiv:2104.04543. To capture epistemic uncertainty in a neural network, we place a prior over its parameters and use Bayesian inference to estimate the posterior.
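Schematically, with $w$ the network weights and $D$ the training data, Bayes' rule gives the posterior:

```latex
p(w \mid D) = \frac{p(D \mid w)\, p(w)}{p(D)}
```

The normalising evidence $p(D)$ requires integrating over all weight configurations, which is intractable for a neural network; this is what motivates the variational approximation below.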
To make the posterior tractable for a neural network we employ variational inference, approximating the true posterior with a mean-field Gaussian. For a regression task with a Gaussian likelihood, using a model that outputs the predictive mean and variance,
we get as minimisation objective the Gaussian negative log-likelihood
plus the KL divergence between the variational posterior and the prior.
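As a sketch of this objective (not the repository's exact code), assume the network outputs a predictive mean `mu` and log-variance `log_var` per input, and the variational posterior over the weights is a mean-field Gaussian with parameters `mu_q`, `log_var_q` measured against a standard-normal prior:

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Heteroscedastic Gaussian negative log-likelihood (constants dropped).

    Predicting log-variance rather than variance keeps the loss
    numerically stable, since exp(-log_var) is always positive.
    """
    return 0.5 * np.mean(np.exp(-log_var) * (y - mu) ** 2 + log_var)

def kl_to_standard_normal(mu_q, log_var_q):
    """Closed-form KL(q || p) for mean-field Gaussian q, standard-normal prior p."""
    return 0.5 * np.sum(np.exp(log_var_q) + mu_q ** 2 - 1.0 - log_var_q)

def loss(y, mu, log_var, mu_q, log_var_q, n_train):
    """Minimisation objective: data NLL plus the KL term, scaled per example."""
    return gaussian_nll(y, mu, log_var) + kl_to_standard_normal(mu_q, log_var_q) / n_train
```

The `1 / n_train` scaling distributes the single KL penalty over the training set so that minibatch gradients remain unbiased.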
We provide a few example notebooks to illustrate the technique.