In Bayesian Deep Learning, distributions over the output of classification neural networks are often approximated by first constructing a Gaussian distribution over the weights, then sampling from it to obtain a distribution over the categorical output. This is costly. We reconsider old work to construct a Dirichlet approximation of this output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions (the conjugate prior of the categorical distribution) in the output space. We argue that the resulting Dirichlet distribution has theoretical and practical advantages, in particular more efficient computation of the uncertainty estimate and scaling to large datasets such as ImageNet and large networks such as DenseNet. We demonstrate this Dirichlet approximation by using it to construct a lightweight uncertainty-aware output ranking for the ImageNet setup.
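The abstract does not spell the analytic map out, but the classic Laplace-bridge moment match conveys its flavor: a Gaussian over the K logits with means mu_k and variances sigma_kk is matched to a Dirichlet with concentrations alpha_k = (1 - 2/K + e^{mu_k} * sum_l e^{-mu_l} / K^2) / sigma_kk. Below is a minimal sketch under the assumption of a diagonal logit covariance; the function name and the numerically stabilised pairwise form are ours, not from the paper.

```python
import numpy as np

def gaussian_to_dirichlet(mu, var):
    """Map a diagonal Gaussian over K logits to Dirichlet concentrations.

    Hypothetical sketch of the classic Laplace-bridge moment match
        alpha_k = (1 - 2/K + e^{mu_k} * sum_l e^{-mu_l} / K^2) / var_k,
    computed via pairwise logit differences for numerical stability.

    mu  : (K,) logit means
    var : (K,) logit variances (diagonal of the logit covariance)
    """
    K = mu.shape[0]
    # sum_l exp(mu_k - mu_l) equals exp(mu_k) * sum_l exp(-mu_l),
    # but avoids overflow for large-magnitude logits
    pairwise = np.exp(mu[:, None] - mu[None, :]).sum(axis=1)
    return (1.0 - 2.0 / K + pairwise / K**2) / var

# Toy usage: the Dirichlet mean gives the expected class probabilities,
# and the total concentration alpha.sum() acts as a confidence measure.
mu = np.array([2.0, 0.5, -1.0])
var = np.array([0.5, 0.8, 0.3])
alpha = gaussian_to_dirichlet(mu, var)
probs = alpha / alpha.sum()  # expected softmax output under the Dirichlet
```

Because such a map is a closed form, per-class uncertainties can be read off the logit moments in a single pass, rather than estimated from many sampled softmax evaluations.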