Model calibration, which concerns how well a model's predicted confidence matches the actual likelihood of its predictions being correct, not only plays a vital part in statistical model design but also has substantial practical applications, such as optimal decision-making in the real world. However, modern deep neural networks have been found to be generally poorly calibrated, overestimating (or underestimating) their predictive confidence, a problem closely related to overfitting. In this paper, we propose Annealing Double-Head, a simple-to-implement but highly effective architecture for calibrating a DNN during training. To be precise, we construct an additional calibration head, a shallow neural network that typically has one latent layer, on top of the last latent layer of the base model to map the logits to calibrated confidence. Furthermore, we develop a simple annealing technique that dynamically scales the logits produced by the calibration head during training to improve its performance. Under both in-distribution and distributional-shift conditions, we exhaustively evaluate our Annealing Double-Head architecture on multiple pairs of contemporary DNN architectures and vision and speech datasets. We demonstrate that our method achieves state-of-the-art model calibration performance without post-processing while providing predictive accuracy comparable to other recently proposed calibration methods on a range of learning tasks.
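As a rough illustration of the described design (not the authors' exact implementation), a minimal PyTorch sketch follows. The hidden width, the annealing schedule, and all names (`AnnealingDoubleHead`, `calib_head`, `anneal`) are assumptions introduced for illustration only.

```python
import torch
import torch.nn as nn

class AnnealingDoubleHead(nn.Module):
    """Hypothetical sketch: a backbone with the normal classification head
    plus a shallow calibration head attached after the last latent layer."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int,
                 calib_hidden: int = 64):
        super().__init__()
        self.backbone = backbone                         # produces latent features
        self.cls_head = nn.Linear(feat_dim, num_classes) # normal prediction head
        # Calibration head: one hidden layer mapping the logits to
        # calibrated logits (exact architecture is an assumption).
        self.calib_head = nn.Sequential(
            nn.Linear(num_classes, calib_hidden),
            nn.ReLU(),
            nn.Linear(calib_hidden, num_classes),
        )

    def forward(self, x: torch.Tensor, anneal: float = 1.0):
        feats = self.backbone(x)
        logits = self.cls_head(feats)
        # Annealing: dynamically scale the calibration head's logits during
        # training; the schedule driving `anneal` is assumed, not specified.
        calib_logits = self.calib_head(logits) * anneal
        return logits, calib_logits

# Example training-loop usage with an assumed linear decay schedule:
#   for epoch in range(num_epochs):
#       anneal = max(0.1, 1.0 - epoch / num_epochs)
#       logits, calib_logits = model(batch, anneal=anneal)
```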