Out-of-distribution (OOD) detection methods assume access to test ground truths, i.e., labels indicating whether individual test samples are in-distribution (IND) or OOD. In the real world, however, such ground truths are not always available, so we cannot know which samples are detected correctly and cannot compute metrics like AUROC to evaluate the performance of different OOD detection methods. In this paper, we are the first to introduce the unsupervised evaluation problem in OOD detection, which aims to evaluate OOD detection methods in real-world changing environments without OOD labels. We propose three methods to compute Gscore as an unsupervised indicator of OOD detection performance. We further introduce a new benchmark, Gbench, which contains 200 real-world OOD datasets with various label spaces for training and evaluating our method. Through experiments, we find a strong quantitative correlation between Gscore and OOD detection performance. Extensive experiments demonstrate that our Gscore achieves state-of-the-art performance. Gscore also generalizes well across different IND/OOD datasets, OOD detection methods, backbones, and dataset sizes. We further provide interesting analyses of the effects of backbones and IND/OOD datasets on OOD detection performance. The data and code will be available.
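To illustrate why unsupervised evaluation is needed, the sketch below shows that the standard AUROC metric cannot be computed without per-sample IND/OOD ground truths: it is the rank statistic measuring how often an OOD sample receives a higher detector score than an IND sample. The scores and labels here are illustrative, not from the paper.

```python
def auroc(scores, is_ood):
    """AUROC via the Mann-Whitney statistic: the probability that a
    randomly chosen OOD sample gets a higher OOD score than a randomly
    chosen IND sample (ties count half). Requires ground-truth labels."""
    ood = [s for s, o in zip(scores, is_ood) if o]
    ind = [s for s, o in zip(scores, is_ood) if not o]
    wins = sum((o > i) + 0.5 * (o == i) for o in ood for i in ind)
    return wins / (len(ood) * len(ind))

# Hypothetical detector scores; the is_ood flags are exactly the
# ground truths that are unavailable in real-world deployment.
scores = [0.9, 0.8, 0.3, 0.2, 0.7]
is_ood = [True, True, False, False, True]
print(auroc(scores, is_ood))  # 1.0: every OOD score exceeds every IND score
```

Because the label vector is indispensable in this computation, deployed systems need a label-free proxy such as Gscore to rank detection methods.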