Over the last few decades, various methods have been proposed for estimating prediction intervals in regression settings, including Bayesian methods, ensemble methods, direct interval estimation methods, and conformal prediction methods. An important issue is the calibration of these methods: the generated prediction intervals should attain a predefined coverage level without being overly conservative. In this work, we review the above four classes of methods from a conceptual and experimental point of view. Results on benchmark data sets from various domains highlight large fluctuations in performance from one data set to another. These fluctuations can be attributed to the violation of assumptions that are inherent to certain classes of methods. We illustrate how conformal prediction can be used as a general calibration procedure for methods that deliver poor results without a calibration step.
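To make the last point concrete, below is a minimal sketch of split conformal calibration wrapped around an arbitrary point regressor. The choice of model (a scikit-learn RandomForestRegressor), the synthetic data, and the 90% target coverage are illustrative assumptions, not the paper's experimental setup.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic 1-D regression problem (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(2000)

# Split into a proper training set and a held-out calibration set.
X_train, X_cal, y_train, y_cal = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Any point predictor can be plugged in here.
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Nonconformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - model.predict(X_cal))

# Finite-sample quantile for (1 - alpha) coverage under exchangeability.
alpha = 0.1
n = len(scores)
level = np.ceil((n + 1) * (1 - alpha)) / n
q = np.quantile(scores, level, method="higher")

# Calibrated interval for a new input: [f(x) - q, f(x) + q].
x_new = np.array([[0.5]])
pred = model.predict(x_new)[0]
print(f"90% prediction interval: [{pred - q:.3f}, {pred + q:.3f}]")

Because the calibration step only assumes exchangeability of the calibration and test points, the resulting intervals achieve at least the target marginal coverage regardless of the underlying regressor, which is what makes split conformal prediction usable as a post-hoc calibration step for the other method classes.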