Despite their impressive performance, deep neural networks are widely reported to be overconfident in their predictions. Finding effective and efficient calibration methods for neural networks is therefore an important step towards better uncertainty quantification in deep learning. In this manuscript, we introduce a novel calibration technique named expectation consistency (EC), consisting of a post-training rescaling of the last-layer weights that enforces the average validation confidence to coincide with the average proportion of correct labels. First, we show that the EC method achieves calibration performance similar to temperature scaling (TS) across different neural network architectures and data sets, while requiring comparable amounts of validation data and computational resources. However, we argue that EC provides a principled method grounded in a Bayesian optimality principle known as the Nishimori identity. Next, we provide an asymptotic characterization of both TS and EC in a synthetic setting and show that their performance crucially depends on the target function. In particular, we discuss examples where EC significantly outperforms TS.
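To make the EC procedure concrete, here is a minimal sketch of the calibration step described above: a single scalar rescaling of the validation logits is chosen so that the average top-class softmax confidence matches the validation accuracy. The function name, the root-finding bracket, and the use of Brent's method are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import brentq

def expectation_consistency_scale(logits, labels, lo=0.05, hi=20.0):
    """Find a scalar beta rescaling the logits so that the mean
    top-class softmax confidence on validation data equals the
    validation accuracy (expectation consistency)."""
    preds = logits.argmax(axis=1)
    accuracy = np.mean(preds == labels)

    def mean_confidence(beta):
        z = beta * logits
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
        return probs.max(axis=1).mean()

    # Mean confidence increases with beta (uniform as beta -> 0,
    # saturating at 1 as beta -> infinity), so the constraint
    # mean_confidence(beta) = accuracy can be solved by root finding,
    # assuming the bracket [lo, hi] contains the solution.
    return brentq(lambda b: mean_confidence(b) - accuracy, lo, hi)
```

In practice the recovered scale plays the same role as an inverse temperature in TS: the calibrated probabilities are obtained by applying the softmax to `beta * logits`, but the scale is fixed by the expectation-consistency constraint rather than by minimizing a validation loss.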