Modern neural networks have been found to be miscalibrated with respect to confidence calibration, i.e., their predicted confidence scores do not reflect the observed accuracy or precision. Recent work has introduced methods for post-hoc confidence calibration for classification as well as for object detection to address this issue. Especially in safety-critical applications, a reliable self-assessment of a model is crucial. But what if the calibration method itself is uncertain, e.g., due to an insufficient knowledge base? We introduce Bayesian confidence calibration - a framework to obtain calibrated confidence estimates in conjunction with an uncertainty of the calibration method. Commonly, Bayesian neural networks (BNNs) are used to indicate a network's uncertainty about a certain prediction. BNNs are neural networks that use distributions over their weights for inference instead of point estimates. We transfer this idea of using distributions to confidence calibration. For this purpose, we use stochastic variational inference to build a calibration mapping that outputs a probability distribution rather than a single calibrated estimate. Using this approach, we achieve state-of-the-art calibration performance for object detection. Finally, we show that this additional type of uncertainty can be used as a sufficient criterion for covariate shift detection. All code is open source and available at https://github.com/EFS-OpenSource/calibration-framework.
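The core idea - a calibration mapping whose parameters are distributions, so that each raw confidence maps to a distribution of calibrated confidences - can be sketched in a few lines. The snippet below is a minimal, illustrative Monte Carlo version using Platt scaling, not the authors' stochastic variational inference implementation; the posterior means and scales for the calibration parameters are placeholder values.

```python
import numpy as np

def bayesian_platt_calibrate(confidence, w_samples, b_samples):
    """Map a raw confidence through an ensemble of Platt-scaling
    parameters (w, b) drawn from an assumed approximate posterior.
    Returns the mean calibrated confidence and its spread, the
    latter expressing the calibration method's own uncertainty."""
    logit = np.log(confidence / (1.0 - confidence))
    calibrated = 1.0 / (1.0 + np.exp(-(w_samples * logit + b_samples)))
    return calibrated.mean(), calibrated.std()

rng = np.random.default_rng(0)
# Hypothetical posterior over the calibration parameters (placeholder values);
# in the paper this posterior is obtained via stochastic variational inference.
w = rng.normal(loc=0.8, scale=0.05, size=1000)
b = rng.normal(loc=-0.1, scale=0.05, size=1000)

mean_conf, conf_std = bayesian_platt_calibrate(0.9, w, b)
```

A large `conf_std` on new inputs signals that the calibration mapping itself is operating outside its knowledge base, which is what makes this quantity usable for covariate shift detection.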