Recent work demonstrates that the early layers of a neural network contain information useful for prediction. Inspired by this, we show that extending temperature scaling across all layers improves both calibration and accuracy. We call this procedure "layer-stack temperature scaling" (LATES). Informally, LATES grants each layer a weighted vote during inference. We evaluate it on five popular convolutional neural network architectures, both in- and out-of-distribution, and observe a consistent improvement over temperature scaling in terms of accuracy, calibration, and AUC. All conclusions are supported by comprehensive statistical analyses. Since LATES neither retrains the network nor introduces many additional parameters, its advantages can be reaped without requiring data beyond what is already used for temperature scaling. Finally, we show that combining LATES with Monte Carlo Dropout matches state-of-the-art results on CIFAR10/100.
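To make the "weighted vote" intuition concrete, the sketch below shows one plausible reading of layer-stack temperature scaling: each layer's logits receive their own temperature, and the resulting per-layer probabilities are mixed with learned, softmax-normalized vote weights. This is a minimal illustration under our own assumptions (class names, the exponential temperature parameterization, and the convex combination rule are ours), not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import List


class LATESHead(nn.Module):
    """Sketch of layer-stack temperature scaling (LATES).

    Assumes the frozen backbone exposes per-layer logits (e.g. via small
    linear probes on intermediate features); only the temperatures and
    vote weights below would be fit on the held-out calibration set,
    mirroring standard temperature scaling.
    """

    def __init__(self, num_layers: int):
        super().__init__()
        # One temperature per layer; log-parameterized so exp() keeps it positive.
        # Initialized to log(1) = 0, i.e. a plain softmax per layer.
        self.log_temps = nn.Parameter(torch.zeros(num_layers))
        # One vote weight per layer, normalized with a softmax when used.
        self.vote_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, per_layer_logits: List[torch.Tensor]) -> torch.Tensor:
        # per_layer_logits: one [batch, num_classes] tensor per layer.
        weights = F.softmax(self.vote_logits, dim=0)
        probs = torch.zeros_like(per_layer_logits[0])
        for k, logits in enumerate(per_layer_logits):
            temp = self.log_temps[k].exp()
            # Each layer casts a temperature-scaled, weighted vote.
            probs = probs + weights[k] * F.softmax(logits / temp, dim=-1)
        return probs  # combined predictive distribution
```

Because the backbone stays fixed and only a handful of scalars (one temperature and one weight per layer) are optimized, such a head would need no more data than ordinary temperature scaling, consistent with the claim above.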