Accurate estimation of predictive uncertainty (model calibration) is essential for the safe application of neural networks. Many instances of miscalibration in modern neural networks have been reported, suggesting a trend that newer, more accurate models produce poorly calibrated predictions. Here, we revisit this question for recent state-of-the-art image classification models. We systematically relate model calibration and accuracy, and find that the most recent models, notably those not using convolutions, are among the best calibrated. Trends observed in prior model generations, such as decay of calibration with distribution shift or model size, are less pronounced in recent architectures. We also show that model size and amount of pretraining do not fully explain these differences, suggesting that architecture is a major determinant of calibration properties.
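The abstract does not name a specific metric, but calibration of classifiers is commonly quantified with the expected calibration error (ECE): predictions are binned by confidence, and the gap between average confidence and empirical accuracy is averaged across bins. A minimal sketch of this standard metric (function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence and average |accuracy - confidence|
    per bin, weighted by the fraction of samples landing in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            # Gap between how accurate the model is and how confident it claims to be.
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# Toy example: predictions made with 80% confidence that are right 80% of the time
# are perfectly calibrated, so the ECE is (numerically) zero.
conf = [0.8] * 10
hit = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(expected_calibration_error(conf, hit))
```

A model can thus be highly accurate yet poorly calibrated (e.g., 99% confident but only 90% accurate), which is why the paper treats calibration and accuracy as separate axes.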