Out-of-Distribution (OOD) detection, i.e., identifying whether an input is sampled from a novel distribution other than the training distribution, is a critical task for safely deploying machine learning systems in the open world. Recently, post hoc detection utilizing pre-trained models has shown promising performance and can be scaled to large-scale problems. This advance raises a natural question: can we leverage the diversity of multiple pre-trained models to improve the performance of post hoc detection methods? In this work, we propose a detection enhancement method that ensembles multiple detection decisions derived from a zoo of pre-trained models. Our approach uses the p-value instead of the commonly used hard threshold and leverages the fundamental framework of multiple hypothesis testing to control the true positive rate of In-Distribution (ID) data. We focus on the usage of model zoos and provide systematic empirical comparisons with current state-of-the-art methods on various OOD detection benchmarks. The proposed ensemble scheme shows consistent improvement over single-model detectors and significantly outperforms current competitive methods, substantially improving the relative performance by 65.40% and 26.96% on the CIFAR10 and ImageNet benchmarks, respectively.
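To make the p-value ensemble idea concrete, the following is a minimal sketch, not the authors' exact procedure: each model's detection score on a test input is converted to an empirical p-value against that model's held-out ID scores, and a Benjamini-Hochberg-style multiple-testing step combines the per-model decisions. The function names, the choice of empirical p-values, and the BH procedure are illustrative assumptions.

```python
import numpy as np

def empirical_p_value(score, id_scores):
    # Empirical p-value under the null "input is ID": the fraction of
    # held-out ID scores at least as OOD-like (here, at least as low).
    # The +1 terms give the standard finite-sample correction.
    return (np.sum(id_scores <= score) + 1) / (len(id_scores) + 1)

def ensemble_ood_decision(test_scores, id_scores_per_model, alpha=0.05):
    """One p-value per pre-trained model, then a Benjamini-Hochberg-style
    step across models; rejecting any null flags the input as OOD.
    (Illustrative sketch, not the paper's exact algorithm.)"""
    p = np.array([empirical_p_value(s, ids)
                  for s, ids in zip(test_scores, id_scores_per_model)])
    m = len(p)
    # BH line: compare the k-th smallest p-value against alpha * k / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    reject = np.sort(p) <= thresholds
    return bool(reject.any())  # True -> declare the input OOD
```

Controlling the rate at which ID inputs are wrongly rejected across the model zoo is what preserves the ID true positive rate; a single hard threshold per model offers no such joint guarantee.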