The field of distribution-free predictive inference provides tools for provably valid prediction without any assumptions on the distribution of the data; these tools can be paired with any regression algorithm to produce accurate and reliable predictive intervals. The guarantees provided by these methods are typically marginal, meaning that predictive coverage holds on average over both the training data set and the test point that is queried. However, it may be preferable to obtain a stronger guarantee of training-conditional coverage, which would ensure that most draws of the training data set yield accurate predictive coverage on future test points. This property is known to hold for the split conformal prediction method. In this work, we examine the training-conditional coverage properties of several other distribution-free predictive inference methods, and find that training-conditional coverage is achieved by some methods but, for others, is impossible to guarantee without further assumptions.
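To make the split conformal method mentioned above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-dimensional feature, a least-squares line as the base regressor (any regression algorithm could be substituted), and absolute residuals as the conformity score. The function name `split_conformal_interval` and its parameters are hypothetical, chosen only for illustration.

```python
import math
import numpy as np

def split_conformal_interval(x_train, y_train, x_cal, y_cal, x_test, alpha=0.1):
    """Sketch of split conformal prediction with absolute-residual scores.

    Assumption: the base regressor is a least-squares line fit with np.polyfit;
    this is an illustrative choice, not part of the method itself.
    """
    # Fit the base model on the proper training split only.
    coef = np.polyfit(np.asarray(x_train), np.asarray(y_train), deg=1)

    # Conformity scores: absolute residuals on the held-out calibration split.
    cal_pred = np.polyval(coef, np.asarray(x_cal))
    scores = np.sort(np.abs(np.asarray(y_cal) - cal_pred))

    # Take the ceil((1 - alpha) * (n + 1))-th smallest score as the half-width.
    n = len(scores)
    k = math.ceil((1 - alpha) * (n + 1))
    # If k exceeds n, the calibration set is too small and the interval is infinite.
    q = scores[k - 1] if k <= n else np.inf

    center = np.polyval(coef, np.asarray(x_test))
    return center - q, center + q
```

The marginal guarantee averages over the random training and calibration data as well as the test point. Training-conditional coverage instead asks how the coverage of the returned interval behaves for a single fixed draw of that data. One way to explore this empirically is to fix one training and calibration draw and measure coverage over many fresh test points.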