Many methods have been proposed to quantify the predictive uncertainty associated with the outputs of deep neural networks. Among them, ensemble methods often lead to state-of-the-art results, though they require modifications to the training procedure and are computationally costly at both training and inference time. In this paper, we propose a new single-model approach. The main idea is inspired by the observation that we can "simulate" an ensemble of models by drawing from a Gaussian distribution, with a form similar to those arising from asymptotic normality theory, the infinitesimal jackknife, the Laplace approximation to Bayesian neural networks, and the trajectories of stochastic gradient descent. However, instead of using each model in the "ensemble" to predict and then aggregating their predictions, we integrate the softmax outputs of the neural network over the Gaussian distribution. We use a mean-field approximation formula to compute this analytically intractable integral. The proposed approach has several appealing properties: it functions as an ensemble without requiring multiple models, and it enables closed-form approximate inference using only the first and second moments of the Gaussian. Empirically, the proposed approach performs competitively with state-of-the-art methods, including deep ensembles, temperature scaling, dropout, and Bayesian NNs, on standard uncertainty estimation tasks. It also outperforms many of these methods on out-of-distribution detection.
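To make the core idea concrete, below is a minimal NumPy sketch (not the paper's reference implementation) contrasting the two routes the abstract describes: the Monte Carlo "simulated ensemble", which averages softmax outputs over samples from a Gaussian on the logits, and a closed-form mean-field approximation that uses only the first and second moments. The function names, the diagonal-covariance assumption, and the scale constant lam = π/8 (the classical probit-matching choice) are illustrative assumptions; the paper's exact mean-field formula may differ.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mc_expected_softmax(mu, var, n_samples=10_000, seed=0):
    """Monte Carlo estimate of E[softmax(z)] with z ~ N(mu, diag(var)).

    Each sample plays the role of one ensemble "member"; their softmax
    predictions are averaged, mimicking an explicit deep ensemble."""
    rng = np.random.default_rng(seed)
    z = mu + np.sqrt(var) * rng.standard_normal((n_samples, mu.shape[-1]))
    return softmax(z, axis=-1).mean(axis=0)

def mean_field_softmax(mu, var, lam=np.pi / 8.0):
    """Closed-form mean-field approximation of the same integral:
    each logit mean is rescaled by 1/sqrt(1 + lam * var_k) before the
    softmax, so only the Gaussian's first and second moments are used
    and no sampling (i.e., no multi-model inference) is needed."""
    return softmax(mu / np.sqrt(1.0 + lam * var))

# Hypothetical logit moments for a 3-class problem.
mu = np.array([2.0, 0.5, -1.0])
var = np.array([1.5, 0.3, 2.0])
print(mc_expected_softmax(mu, var))  # ensemble-style MC average
print(mean_field_softmax(mu, var))   # single closed-form forward pass
```

On inputs like these, the mean-field output closely tracks the Monte Carlo average while costing a single forward pass, which is the practical appeal of the single-model approach: ensemble-like predictive uncertainty without training or evaluating multiple networks.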