To communicate instance-wise uncertainty for prediction tasks, we show how to generate set-valued predictions for black-box predictors that control the expected loss on future test points at a user-specified level. Our approach provides explicit finite-sample guarantees for any dataset by using a holdout set to calibrate the size of the prediction sets. This framework enables simple, distribution-free, rigorous error control for many tasks, and we demonstrate it in five large-scale machine learning problems: (1) classification problems where some mistakes are more costly than others; (2) multi-label classification, where each observation has multiple associated labels; (3) classification problems where the labels have a hierarchical structure; (4) image segmentation, where we wish to predict a set of pixels containing an object of interest; and (5) protein structure prediction. Lastly, we discuss extensions to uncertainty quantification for ranking, metric learning and distributionally robust learning.
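The abstract describes the core mechanism, calibrating the size of prediction sets on a holdout set so that the expected loss is controlled at a user-specified level, without giving code. Below is a minimal, hedged Python sketch of that idea under simplifying assumptions: losses are bounded in [0, 1], the risk shrinks monotonically as the prediction sets grow, and a simple Hoeffding upper confidence bound is used in place of the tighter bounds the full method employs. The names `hoeffding_ucb`, `calibrate`, `holdout_losses`, and `lambdas` are illustrative, not the paper's API.

```python
import numpy as np

def hoeffding_ucb(emp_risk, n, delta):
    # Upper confidence bound on the mean of a [0, 1]-bounded loss
    # via Hoeffding's inequality; a stand-in for the tighter bounds
    # used in the actual procedure.
    return emp_risk + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def calibrate(holdout_losses, lambdas, alpha, delta):
    """Pick the largest threshold (smallest prediction sets) whose
    risk upper bound on the holdout set stays below alpha.

    holdout_losses: function mapping a threshold lam to an array of
        per-example losses in [0, 1] on the holdout set (assumed).
    lambdas: candidate thresholds, ordered so prediction sets shrink
        as we move through the list (assumed monotone risk).
    alpha: target risk level; delta: tolerated failure probability.
    """
    chosen = lambdas[0]  # most conservative (largest sets) fallback
    for lam in lambdas:
        losses = holdout_losses(lam)
        ucb = hoeffding_ucb(losses.mean(), len(losses), delta)
        if ucb > alpha:
            break  # stop before the risk bound is violated
        chosen = lam
    return chosen
```

In a multi-label setting, for instance, `holdout_losses(lam)` might return the false-negative proportion of the set of labels whose predicted scores exceed `lam`; the calibrated threshold then yields prediction sets whose expected false-negative rate is controlled at level alpha on future test points.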