In Bayesian Deep Learning, distributions over the outputs of classification neural networks are often approximated by first constructing a Gaussian distribution over the weights and then sampling from it to obtain a distribution over the softmax outputs. This is costly. We reconsider an older result (the Laplace Bridge) to construct a Dirichlet approximation of this softmax output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions (the conjugate prior of the Categorical distribution) in the output space. Importantly, the vanilla Laplace Bridge comes with certain limitations. We analyze these and propose a simple correction that compares favorably to other commonly used estimates of the softmax-Gaussian integral. We demonstrate that the resulting Dirichlet distribution has multiple advantages, in particular more efficient computation of the uncertainty estimate and scaling to large datasets and networks such as ImageNet and DenseNet. We further demonstrate the usefulness of this Dirichlet approximation by using it to construct a lightweight uncertainty-aware output ranking for ImageNet.
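As a rough illustration of the analytic map the abstract describes, the sketch below implements the standard Laplace Bridge formula for mapping a diagonal Gaussian over logits to Dirichlet concentration parameters. This is a minimal sketch under stated assumptions (diagonal covariance, the classical form of the map); the paper's corrected variant may differ in details, and the function name `laplace_bridge` is illustrative.

```python
import numpy as np

def laplace_bridge(mu: np.ndarray, sigma_sq: np.ndarray) -> np.ndarray:
    """Map logit-space Gaussian moments to Dirichlet concentration parameters.

    mu:       mean of the Gaussian over the K logits, shape (K,)
    sigma_sq: marginal (diagonal) variances of the logits, shape (K,)
    returns:  Dirichlet parameters alpha, shape (K,)
    """
    K = mu.shape[0]
    # Classical Laplace Bridge map (assumed form):
    # alpha_k = (1 / sigma_k^2) * (1 - 2/K + exp(mu_k) / K^2 * sum_j exp(-mu_j))
    sum_exp_neg = np.sum(np.exp(-mu))
    return (1.0 / sigma_sq) * (1.0 - 2.0 / K + np.exp(mu) / K**2 * sum_exp_neg)

# Usage: a 3-class logit Gaussian. The Dirichlet mean alpha / alpha.sum()
# approximates the expected softmax output, while alpha.sum() grows as the
# logit variances shrink, i.e. it reflects the model's confidence.
mu = np.array([2.0, 0.5, -1.0])
sigma_sq = np.array([0.5, 0.5, 0.5])
alpha = laplace_bridge(mu, sigma_sq)
print(alpha, alpha / alpha.sum())
```

The appeal of this map, as the abstract notes, is that it is a closed-form expression: no samples from the weight-space Gaussian are needed to characterize the output distribution.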