Conformal inference has played a pivotal role in providing uncertainty quantification with finite-sample guarantees for black-box ML prediction algorithms. Traditionally, conformal prediction requires a data-independent specification of the miscoverage level. In practice, however, one may wish to update the miscoverage level after computing the prediction set. For example, in binary classification, an analyst might start with $95\%$ prediction sets and observe that most of them contain both outcome classes. Since prediction sets containing both classes are uninformative, the analyst may then prefer, say, $80\%$ prediction sets. Constructing prediction sets that guarantee coverage at a data-dependent miscoverage level can be cast as a post-selection inference problem. In this work, we develop uniform conformal inference with finite-sample prediction guarantees at arbitrary data-dependent miscoverage levels, using distribution-free confidence bands for distribution functions. This allows practitioners to freely trade coverage probability for the quality of the prediction set by any criterion of their choice (say, the size of the prediction set) while maintaining finite-sample guarantees analogous to those of traditional conformal inference.
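One concrete way to make the calibration step valid uniformly over all miscoverage levels is to wrap a distribution-free confidence band, such as a Dvoretzky–Kiefer–Wolfowitz (DKW) band, around the empirical CDF of the calibration conformity scores, then inflate the quantile level by the band's half-width. The sketch below illustrates that idea under our own assumptions (split-conformal scores, a DKW band, and the function name are all illustrative choices, not the paper's exact construction):

```python
import numpy as np

def uniform_conformal_threshold(cal_scores, alpha, delta=0.05):
    """Illustrative sketch: a score threshold intended to remain valid for
    a miscoverage level `alpha` chosen AFTER seeing the data, by absorbing
    a DKW confidence band around the empirical CDF of calibration scores.
    """
    n = len(cal_scores)
    # DKW inequality: with probability >= 1 - delta, the empirical CDF is
    # uniformly within eps of the true CDF of the conformity scores.
    eps = np.sqrt(np.log(2.0 / delta) / (2.0 * n))
    # Inflate the target quantile level to account for the band width,
    # so the guarantee holds simultaneously for every alpha.
    level = min(1.0, 1.0 - alpha + eps)
    return np.quantile(cal_scores, level, method="higher")

# Hypothetical usage: calibration scores from a split-conformal setup.
scores = np.arange(1, 101) / 100.0
t95 = uniform_conformal_threshold(scores, alpha=0.05)  # 95% sets
t80 = uniform_conformal_threshold(scores, alpha=0.20)  # later switch to 80%
```

Because the DKW band holds uniformly over the whole CDF, switching from `alpha=0.05` to `alpha=0.20` after inspecting the $95\%$ sets does not invalidate the guarantee; the price is the additive slack `eps`, which shrinks at rate $O(n^{-1/2})$ in the calibration size.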