Neural networks applied to real-world problems are often required not only to make accurate predictions but also to provide a confidence level for each prediction. The calibration of a model indicates how closely its estimated confidence matches the true probability of being correct. This paper presents a survey of the confidence calibration problem in the context of neural networks and provides an empirical comparison of calibration methods. We analyze the problem statement, definitions of calibration, and different approaches to its evaluation: visualizations and scalar measures that estimate whether a model is well-calibrated. We review modern calibration techniques, both post-processing methods and those that require changes to training. Our empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
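As a concrete illustration of one such scalar measure, the sketch below computes a binned Expected Calibration Error (ECE), a widely used score of the gap between predicted confidence and empirical accuracy. The abstract does not name a specific measure, so the choice of ECE, the function name, and the equal-width binning scheme are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Illustrative binned ECE (an assumed measure, not from the paper).

    confidences: predicted confidence per sample, in [0, 1]
    correct: 1 if the prediction was correct, else 0
    Returns the bin-weighted average |accuracy - confidence| gap.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)  # equal-width bins (assumption)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return ece
```

A perfectly calibrated model yields an ECE of zero: in every confidence bin, the empirical accuracy equals the average predicted confidence.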