We develop methods for forming prediction sets in an online setting where the data-generating distribution is allowed to vary over time in an unknown fashion. Our framework builds on ideas from conformal inference to provide a general wrapper that can be combined with any black-box method that produces point predictions of the unseen label or estimated quantiles of its distribution. While previous conformal inference methods rely on the assumption that the data points are exchangeable, our adaptive approach provably achieves the desired coverage frequency over long time intervals irrespective of the true data-generating process. We accomplish this by modelling the distribution shift as a learning problem in a single parameter whose optimal value varies over time and must be continuously re-estimated. We test our method, adaptive conformal inference, on two real-world datasets and find that its predictions are robust to visible and significant distribution shifts.
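To make the single-parameter idea concrete, here is a minimal sketch in Python. The abstract does not spell out the update rule, so the sketch assumes the simple online step alpha_{t+1} = alpha_t + gamma * (alpha - err_t), where err_t is 1 when the prediction set misses the label and 0 otherwise. It also assumes absolute residuals of a point predictor as conformity scores. The names `adaptive_conformal`, `empirical_quantile`, `gamma`, and `warmup` are illustrative, not the authors' code.

```python
import math
import random


def empirical_quantile(scores, level):
    """Empirical quantile of past scores; infinite outside (0, 1)."""
    if level >= 1.0:
        return math.inf   # set covers everything, so it cannot miss
    if level <= 0.0:
        return -math.inf  # set is empty, so it always misses
    ordered = sorted(scores)
    k = math.ceil(level * len(ordered)) - 1
    return ordered[max(k, 0)]


def adaptive_conformal(y, preds, target_alpha=0.1, gamma=0.01, warmup=50):
    """Online intervals around point predictions with an adaptive miscoverage level.

    Assumed update (not stated in the abstract):
        alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t)
    """
    alpha_t = target_alpha
    # Seed the score history with absolute residuals from a warm-up period.
    scores = [abs(y[i] - preds[i]) for i in range(warmup)]
    intervals, errors = [], []
    for t in range(warmup, len(y)):
        radius = empirical_quantile(scores, 1.0 - alpha_t)
        lo, hi = preds[t] - radius, preds[t] + radius
        err = 0.0 if lo <= y[t] <= hi else 1.0
        # A miss lowers alpha_t (wider sets next time); a hit raises it.
        alpha_t += gamma * (target_alpha - err)
        scores.append(abs(y[t] - preds[t]))
        intervals.append((lo, hi))
        errors.append(err)
    return intervals, errors


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic stream whose noise scale triples halfway through.
    y = [rng.gauss(0.0, 1.0 if i < 1000 else 3.0) for i in range(2050)]
    preds = [0.0] * len(y)
    _, errors = adaptive_conformal(y, preds, target_alpha=0.1, gamma=0.02)
    print("empirical miscoverage:", sum(errors) / len(errors))
```

Under this assumed update, alpha_t stays within gamma of the interval [0, 1], so the running miscoverage rate is pulled toward the target whatever the data sequence looks like. A larger `gamma` reacts faster to shifts at the cost of more volatile interval widths.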