Modern deep learning-based classifiers show very high accuracy on test data, but this does not provide sufficient guarantees for safe deployment, especially in high-stakes AI applications such as medical diagnosis. Usually, predictions are obtained without a reliable uncertainty estimate or a formal guarantee. Conformal prediction (CP) addresses these issues by using the classifier's predictions, e.g., its probability estimates, to predict confidence sets containing the true class with a user-specified probability. However, using CP as a separate processing step after training prevents the underlying model from adapting to the prediction of confidence sets. This paper therefore explores strategies to differentiate through CP during training, with the goal of training the model end-to-end with the conformal wrapper. In our approach, conformal training (ConfTr), we specifically "simulate" conformalization on mini-batches during training. Compared to standard training, ConfTr reduces the average confidence set size (inefficiency) of state-of-the-art CP methods applied after training. Moreover, it allows us to "shape" the confidence sets predicted at test time, which is difficult for standard CP. In experiments on several datasets, we show that ConfTr can influence how inefficiency is distributed across classes, or guide the composition of confidence sets in terms of the included classes, while retaining the coverage guarantees offered by CP.
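To make the conformal wrapper concrete, the following is a minimal sketch of split conformal prediction with a thresholding conformity score (the probability of the true class), which is one standard way to produce the confidence sets described above. The function names, the quantile handling, and the synthetic data are illustrative assumptions, not the paper's implementation; ConfTr itself additionally makes this calibration step differentiable during training.

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration with a thresholding conformity score:
    the predicted probability of the true class. Returns a threshold tau
    such that confidence sets cover the true class with probability
    >= 1 - alpha on exchangeable test data. (Illustrative sketch.)"""
    scores = cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    k = int(np.floor(alpha * (n + 1)))  # finite-sample correction
    if k < 1:
        return -np.inf  # too few calibration points: predict all classes
    return np.sort(scores)[k - 1]  # k-th smallest calibration score

def predict_sets(test_probs, tau):
    """Confidence set: all classes whose probability is at least tau."""
    return [np.flatnonzero(p >= tau) for p in test_probs]

# Hypothetical usage on synthetic data: softmax outputs for 10 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(500, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 10, size=500)
tau = calibrate_threshold(probs[:400], labels[:400], alpha=0.1)
sets = predict_sets(probs[400:], tau)
```

The key design point motivating the paper is visible here: the threshold is computed after training, on a held-out calibration split, so the classifier's weights never "see" the set-prediction objective; ConfTr instead simulates this calibrate-then-predict step on each mini-batch so gradients can flow through it.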