Due to the intractability of characterizing everything that looks unlike the normal data, anomaly detection (AD) is traditionally treated as an unsupervised problem utilizing only normal samples. However, it has recently been found that unsupervised image AD can be drastically improved by exposing the model to huge corpora of random images that represent anomalousness, a technique known as Outlier Exposure. In this paper we show that specialized AD learning methods seem unnecessary for state-of-the-art performance, and furthermore that one can achieve strong performance with just a small collection of Outlier Exposure data, contradicting common assumptions in the field of AD. We find that standard classifiers and semi-supervised one-class methods trained to discern between normal samples and relatively few random natural images are able to outperform the current state of the art on an established AD benchmark with ImageNet. Further experiments reveal that even one well-chosen outlier sample is sufficient to achieve decent performance on this benchmark (79.3% AUC). We investigate this phenomenon and find that one-class methods are more robust to the choice of training outliers, indicating that there are scenarios where these are still more useful than standard classifiers. Additionally, we include experiments that delineate the scenarios in which our results hold. Lastly, no training samples are necessary when one uses the representations learned by CLIP, a recent foundation model, which achieves state-of-the-art AD results on CIFAR-10 and ImageNet in a zero-shot setting.
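As a rough illustration of the zero-shot setting mentioned above, the sketch below scores a single image with CLIP by comparing its embedding against a text prompt describing the normal class and a generic counter-prompt. The prompt wording, the softmax-based anomaly score, and the class/file names are illustrative assumptions for this sketch, not necessarily the exact protocol used in the paper.

```python
# Minimal sketch of zero-shot anomaly scoring with CLIP.
# Assumptions: the `clip` package (https://github.com/openai/CLIP) is installed,
# "dog" is the hypothetical normal class, and "test.png" is the image to score.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

normal_class = "dog"  # hypothetical normal class name
prompts = [f"a photo of a {normal_class}", "a photo of something"]
text_tokens = clip.tokenize(prompts).to(device)

image = preprocess(Image.open("test.png")).unsqueeze(0).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text_tokens)
    # Normalize embeddings so the dot product is a cosine similarity.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    sims = (image_features @ text_features.T).squeeze(0)
    probs = (100.0 * sims).softmax(dim=-1)

# Anomaly score: probability mass assigned to the generic (non-normal) prompt.
anomaly_score = probs[1].item()
print(f"anomaly score: {anomaly_score:.3f}")
```

Note that this requires no training samples at all: only the name of the normal class is supplied, and the anomaly score is derived entirely from CLIP's pretrained image and text representations.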