This paper develops a conformal method to compute prediction intervals for non-parametric regression that can automatically adapt to skewed data. Leveraging black-box machine learning algorithms to estimate the conditional distribution of the outcome using histograms, it translates their output into the shortest prediction intervals with approximate conditional coverage. The resulting prediction intervals provably have marginal coverage in finite samples, while asymptotically achieving conditional coverage and optimal length if the black-box model is consistent. Numerical experiments with simulated and real data demonstrate improved performance compared to state-of-the-art alternatives, including conformalized quantile regression and other distributional conformal prediction approaches.
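To make the pipeline described above concrete, the following is a minimal sketch under stated assumptions, not the paper's exact procedure. A crude per-group empirical histogram stands in for the black-box model. Intervals are grown greedily outward from the modal bin, a simplification chosen so that the intervals are nested. A split-conformal step then calibrates the mass threshold `tau` on held-out data. All names (`bin_scores`, `calibrate`, `predict_interval`, `fit_histograms`) and the simulated skewed data are hypothetical and chosen for illustration only.

```python
import math
import random


def bin_scores(probs):
    """Grow an interval greedily from the modal bin, always adding the more
    probable neighbour. A bin's score is the interval mass once it is included."""
    k = len(probs)
    lo = hi = max(range(k), key=probs.__getitem__)
    mass = probs[lo]
    scores = [0.0] * k
    scores[lo] = mass
    while lo > 0 or hi < k - 1:
        left = probs[lo - 1] if lo > 0 else -1.0       # -1.0 marks "no neighbour"
        right = probs[hi + 1] if hi < k - 1 else -1.0
        if left >= right:
            lo -= 1
            mass += left
            scores[lo] = mass
        else:
            hi += 1
            mass += right
            scores[hi] = mass
    return scores


def calibrate(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    return math.inf if rank > n else sorted(cal_scores)[rank - 1]


def predict_interval(probs, edges, tau):
    """Return the interval made of all bins whose score is at most tau."""
    keep = [j for j, s in enumerate(bin_scores(probs)) if s <= tau]
    if not keep:  # fall back to the modal bin (a superset of the empty set)
        keep = [max(range(len(probs)), key=probs.__getitem__)]
    return edges[min(keep)], edges[max(keep) + 1]


# --- Illustrative demo on simulated, right-skewed data (all assumptions) ---
random.seed(0)
N_BINS, Y_MAX, N_GROUPS = 100, 20.0, 10
edges = [Y_MAX * j / N_BINS for j in range(N_BINS + 1)]


def sample(n):
    xs = [random.random() for _ in range(n)]
    return [(x, (0.5 + x) * random.expovariate(1.0)) for x in xs]


def to_bin(y):
    # Assumes y lies in [0, Y_MAX); larger values are clipped to the last bin.
    return min(int(y / Y_MAX * N_BINS), N_BINS - 1)


def group(x):
    return min(int(x * N_GROUPS), N_GROUPS - 1)


def fit_histograms(data):
    """Stand-in for the black-box model: one smoothed histogram per x-group."""
    counts = [[1e-6] * N_BINS for _ in range(N_GROUPS)]
    for x, y in data:
        counts[group(x)][to_bin(y)] += 1.0
    hists = []
    for row in counts:
        total = sum(row)
        hists.append([c / total for c in row])
    return hists


hists = fit_histograms(sample(2000))
cal = sample(1000)
cal_scores = [bin_scores(hists[group(x)])[to_bin(y)] for x, y in cal]
tau = calibrate(cal_scores, alpha=0.1)

test = sample(1000)
intervals = [predict_interval(hists[group(x)], edges, tau) for x, _ in test]
coverage = sum(lo <= y <= hi for (lo, hi), (_, y) in zip(intervals, test)) / len(test)
print(f"tau = {tau:.3f}, empirical coverage = {coverage:.3f}")
```

Because the greedy intervals are nested, a test outcome is covered exactly when its score is at most `tau`, which is what gives the split-conformal step its marginal guarantee in this sketch. The intervals need not be symmetric around any point estimate, so they can follow the skew of the estimated histogram.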