We introduce Learn then Test, a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees regardless of the underlying model and (unknown) data-generating distribution. The framework addresses, among other examples, false discovery rate control in multi-label classification, intersection-over-union control in instance segmentation, and the simultaneous control of the type-1 error of outlier detection and confidence set coverage in classification or regression. To accomplish this, we solve a key technical challenge: the control of arbitrary risks that are not necessarily monotonic. Our main insight is to reframe the risk-control problem as multiple hypothesis testing, enabling techniques and mathematical arguments different from those in the previous literature. We use our framework to provide new calibration methods for several core machine learning tasks with detailed worked examples in computer vision.
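To make the calibration-as-hypothesis-testing idea concrete, below is a minimal sketch of a Learn then Test-style procedure: for each candidate threshold λ on a grid, a valid p-value for the null hypothesis "risk exceeds α" is computed from a Hoeffding bound, and the thresholds surviving a Bonferroni correction are certified. The specific p-value, the Bonferroni correction, and all function names here are illustrative choices for this sketch, not the paper's exact procedure.

```python
import numpy as np

def hoeffding_pvalue(risk_hat, n, alpha):
    """Valid p-value for H0: true risk > alpha, given the empirical risk of a
    loss bounded in [0, 1] over n calibration points (Hoeffding's inequality)."""
    return np.exp(-2.0 * n * max(alpha - risk_hat, 0.0) ** 2)

def learn_then_test(losses_per_lambda, lambdas, alpha=0.1, delta=0.1):
    """Sketch of Learn then Test calibration.

    losses_per_lambda: array of shape (n_cal, n_lambdas) holding the loss of
    each calibration example at each candidate threshold.
    Returns the thresholds certified to keep risk <= alpha with probability
    >= 1 - delta, using a Bonferroni correction over the grid.
    """
    n, m = losses_per_lambda.shape
    risk_hat = losses_per_lambda.mean(axis=0)              # empirical risk per lambda
    pvals = np.array([hoeffding_pvalue(r, n, alpha) for r in risk_hat])
    certified = pvals <= delta / m                          # reject H_j: R(lambda_j) > alpha
    return [lam for lam, ok in zip(lambdas, certified) if ok]

# Toy usage: a thresholding rule whose loss is 1 when the score exceeds lambda.
rng = np.random.default_rng(0)
scores = rng.uniform(size=500)
lambdas = np.linspace(0.0, 1.0, 101)
losses = (scores[:, None] > lambdas[None, :]).astype(float)
print("certified lambdas:", learn_then_test(losses, lambdas, alpha=0.1, delta=0.1))
```

Note that the loss here is monotone in λ only for convenience of the toy example; the procedure itself never uses monotonicity, which is what allows it to handle the non-monotonic risks discussed in the abstract.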