Conformal predictors are machine learning algorithms that output prediction intervals with a guarantee of marginal validity for finite samples under minimal distributional assumptions. This property makes conformal predictors useful for machine learning tasks that require reliable predictions. It would also be desirable to achieve conditional validity in the same setting, in the sense that the prediction intervals remain valid regardless of conditioning on any property of the object of the prediction. Unfortunately, it has been shown that such conditional validity is impossible to guarantee for non-trivial prediction problems with finite samples. In this article, instead of trying to achieve a strong conditional validity result, the weaker goal of achieving an approximation to conditional validity is considered. A new algorithm is introduced that does this by iteratively adjusting a conformity measure according to deviations from object-conditional validity measured in the training data. Along with some theoretical results, experimental results are provided for three data sets that demonstrate (1) that in real-world machine learning tasks, lack of conditional validity is a measurable problem and (2) that the proposed algorithm is effective at alleviating this problem.
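The gap between marginal and object-conditional validity can be illustrated with a small simulation. The sketch below is not the algorithm introduced in the article; it is a plain split (inductive) conformal predictor on synthetic data, used only to show how coverage can hold on average while differing between regions of the object space. The data-generating process, the constant predictor, the absolute-residual conformity score, and the names `split_conformal_quantile` and `simulate` are all assumptions made for illustration.

```python
import numpy as np


def split_conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points: only the trivial interval is valid
    return np.sort(scores)[k - 1]


def simulate(n_cal=2000, n_test=20000, alpha=0.1, seed=0):
    """Return (marginal, low-x, high-x) empirical coverage on synthetic data."""
    rng = np.random.default_rng(seed)

    def draw(n):
        x = rng.uniform(0.0, 1.0, size=n)
        # Assumed toy model: zero mean, noise standard deviation growing with x.
        y = rng.normal(0.0, 0.2 + 2.0 * x)
        return x, y

    _, y_cal = draw(n_cal)
    x_test, y_test = draw(n_test)

    # The point predictor is the constant 0 (the true conditional mean here),
    # so the nonconformity score is simply |y|.
    q = split_conformal_quantile(np.abs(y_cal), alpha)

    covered = np.abs(y_test) <= q  # interval is [-q, q] for every object
    low = x_test < 0.5
    return covered.mean(), covered[low].mean(), covered[~low].mean()


if __name__ == "__main__":
    marginal, low, high = simulate()
    print(f"marginal coverage: {marginal:.3f}")
    print(f"coverage for x < 0.5:  {low:.3f}")
    print(f"coverage for x >= 0.5: {high:.3f}")
```

Under these assumptions, the overall coverage should sit close to the nominal 90%, while objects with small x are over-covered and objects with large x are under-covered. This kind of deviation, measured per region of the object space, is what an approximation to object-conditional validity aims to reduce.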