Why Calibration Error is Wrong Given Model Uncertainty: Using Posterior Predictive Checks with Deep Learning

Within the last few years, there has been a move towards using statistical models in conjunction with neural networks, with the end goal of better answering the question, "what do our models know?". From this trend, classical metrics such as Prediction Interval Coverage Probability (PICP) and newer metrics such as calibration error have entered the general repertoire of model evaluation, offering insight into how the uncertainty of a model compares to reality. One important component of uncertainty modeling is model uncertainty (epistemic uncertainty), a measurement of what the model does and does not know. However, current evaluation techniques tend to conflate model uncertainty with aleatoric uncertainty (irreducible error), leading to incorrect conclusions. In this paper, using posterior predictive checks, we show that calibration error and its variants are almost always incorrect to use given model uncertainty, and further show how this mistake can lead to trust in bad models and mistrust in good models. Though posterior predictive checks have often been used for in-sample evaluation of Bayesian models, we show that they still have an important place in the modern deep learning world.
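For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how PICP and one simple coverage-based calibration error are commonly computed for a regression model with Gaussian predictive distributions. It is not taken from the paper: the function names, the choice of nominal levels, and the synthetic data are all assumptions made for illustration, and the paper's own definitions may differ.

```python
import numpy as np
from statistics import NormalDist


def picp(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


def gaussian_interval(mean, std, level):
    """Central interval of a Gaussian predictive distribution at a nominal level."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    mean, std = np.asarray(mean), np.asarray(std)
    return mean - z * std, mean + z * std


def coverage_calibration_error(y_true, mean, std, levels=(0.5, 0.8, 0.9, 0.95)):
    """Mean absolute gap between nominal and empirical coverage.

    This is one simple variant chosen for illustration (an assumption),
    not necessarily the definition used in the paper.
    """
    gaps = []
    for level in levels:
        lo, hi = gaussian_interval(mean, std, level)
        gaps.append(abs(picp(y_true, lo, hi) - level))
    return float(np.mean(gaps))


# Synthetic data (hypothetical): targets drawn from a standard normal.
rng = np.random.default_rng(0)
y = rng.normal(loc=0.0, scale=1.0, size=20_000)
mean = np.zeros_like(y)

# Predictive std matches the data-generating noise.
well_specified = coverage_calibration_error(y, mean, np.ones_like(y))
# Predictive std is half the true noise, so intervals are too narrow.
overconfident = coverage_calibration_error(y, mean, 0.5 * np.ones_like(y))
```

Both functions take a single predictive standard deviation per point, so they score the total predictive spread and do not distinguish how much of it is epistemic and how much is aleatoric.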
