We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees. Our calibration algorithms work with any underlying model and (unknown) data-generating distribution and do not require model refitting. The framework addresses, among other examples, false discovery rate control in multi-label classification, intersection-over-union control in instance segmentation, and the simultaneous control of the type-1 error of outlier detection and confidence set coverage in classification or regression. Our main insight is to reframe the risk-control problem as multiple hypothesis testing, enabling techniques and mathematical arguments different from those in the previous literature. We use the framework to provide new calibration methods for several core machine learning tasks, with detailed worked examples in computer vision and tabular medical data.
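To make the multiple-testing reframing concrete, here is a minimal sketch under simplifying assumptions: losses are bounded in [0, 1], each candidate threshold λ in a finite grid is paired with the null hypothesis "risk(λ) > α", a Hoeffding p-value is computed from the empirical risk on a held-out calibration set, and a Bonferroni correction controls the family-wise error rate. The function names and the specific choice of p-value and correction procedure are illustrative assumptions, not necessarily the paper's exact construction.

```python
import numpy as np

def hoeffding_pvalue(risk_hat, n, alpha):
    # Valid p-value for the null "true risk > alpha", given the empirical
    # risk of n i.i.d. losses bounded in [0, 1] (Hoeffding's inequality).
    return np.exp(-2.0 * n * np.maximum(alpha - risk_hat, 0.0) ** 2)

def calibrate(losses, lambdas, alpha, delta):
    # losses: (n, k) array; losses[i, j] in [0, 1] is the loss of
    # calibration point i when the model uses threshold lambdas[j].
    # Returns the thresholds certified to have risk <= alpha, with
    # family-wise error probability at most delta (Bonferroni correction).
    lambdas = np.asarray(lambdas)
    n = losses.shape[0]
    risk_hat = losses.mean(axis=0)            # empirical risk per threshold
    pvals = hoeffding_pvalue(risk_hat, n, alpha)
    rejected = pvals <= delta / lambdas.size  # Bonferroni over the grid
    return lambdas[rejected]

# Usage on synthetic data: mean loss decreases as the threshold grows.
rng = np.random.default_rng(0)
lambdas = np.linspace(0.0, 1.0, 50)
losses = (rng.random((1000, 50)) < 0.3 * (1.0 - lambdas)).astype(float)
safe_lambdas = calibrate(losses, lambdas, alpha=0.1, delta=0.1)
```

Note that Bonferroni is only one valid choice: any procedure controlling the family-wise error rate yields the same finite-sample guarantee, and when the risk is monotone in the threshold, sequential testing procedures can reject a larger set of hypotheses and thus leave more freedom in the final choice of λ.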