Conformal inference has played a pivotal role in providing uncertainty quantification for black-box ML prediction algorithms with finite-sample guarantees. Traditionally, conformal inference requires a data-independent specification of the miscoverage level. In practical applications, however, one might want to update the miscoverage level after computing the prediction set. For example, in binary classification, the analyst might start with $95\%$ prediction sets and see that most of them contain both outcome classes. Because prediction sets containing both classes are undesirable, the analyst might then prefer to consider, say, $80\%$ prediction sets. Constructing prediction sets that guarantee coverage at a data-dependent miscoverage level can be viewed as a post-selection inference problem. In this work, we develop uniform conformal inference with finite-sample prediction guarantees for arbitrary data-dependent miscoverage levels, using distribution-free confidence bands for distribution functions. This allows practitioners to freely trade coverage probability for the quality of the prediction set by any criterion of their choice (say, the size of the prediction set) while maintaining finite-sample guarantees similar to those of traditional conformal inference.
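To make the idea of a confidence band for a distribution function concrete, the following is a minimal sketch and not necessarily the construction used in this work. It assumes a split-conformal setup with i.i.d. calibration scores and uses the one-sided Dvoretzky-Kiefer-Wolfowitz (DKW) inequality as the band. The function names, the `delta` parameter, and the choice of the DKW band are all illustrative assumptions. On an event of probability at least $1-\delta$ over the calibration data, the empirical distribution function of the scores lies within `eps` of the true one everywhere, so the thresholds below are valid for every miscoverage level simultaneously, including one chosen after looking at the resulting prediction sets.

```python
import math
import random


def dkw_margin(n, delta):
    """One-sided DKW margin eps (assumed band, requires 0 < delta <= 0.5).

    With probability >= 1 - delta, F(t) >= F_n(t) - eps for all t at once.
    """
    if not (0.0 < delta <= 0.5):
        raise ValueError("delta must lie in (0, 0.5]")
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def uniform_threshold(cal_scores, alpha, delta=0.1):
    """Smallest calibration score t with F_n(t) - eps >= 1 - alpha.

    Returns math.inf when no calibration score satisfies the inequality,
    i.e. when alpha is too small for the sample size.
    """
    scores = sorted(float(s) for s in cal_scores)
    n = len(scores)
    eps = dkw_margin(n, delta)
    # The k-th smallest score has F_n >= k / n, so we need k / n >= 1 - alpha + eps.
    k = max(1, math.ceil(n * (1.0 - alpha + eps)))
    return scores[k - 1] if k <= n else math.inf


def prediction_set(label_scores, threshold):
    """Labels whose nonconformity score does not exceed the threshold."""
    return {label for label, s in label_scores.items() if s <= threshold}


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical calibration scores; in practice these come from a fitted model.
    cal = [rng.random() for _ in range(2000)]
    for alpha in (0.05, 0.10, 0.20):
        print(alpha, uniform_threshold(cal, alpha, delta=0.1))
```

Because the margin `eps` does not depend on `alpha`, the analyst can inspect the prediction sets for several miscoverage levels and report whichever level gives acceptable sets; the price is the extra failure probability `delta` and slightly larger thresholds than standard split conformal prediction would give at a fixed level.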