In this work we propose a non-contrastive method for anomaly detection and segmentation in images that benefits from both a modern machine learning approach and more classical statistical detection theory. The method consists of three phases. First, features are extracted using a multi-scale image Transformer architecture. Then, these features are fed into a U-shaped Normalizing Flow, which lays the theoretical foundations for the last phase. That final phase computes a pixel-level anomaly map and performs a segmentation based on the a contrario framework. This multiple hypothesis testing strategy makes it possible to derive a robust automatic detection threshold, which is key in many real-world applications where an operating point is needed. The segmentation results are evaluated using the Intersection over Union (IoU) metric. To assess the generated anomaly maps, we report the area under the Receiver Operating Characteristic curve (ROC-AUC) at both image and pixel level. For both metrics, the proposed approach produces state-of-the-art results, ranking first in most MVTec-AD categories, with a mean pixel-level ROC-AUC of 98.74%. Code and trained models are available at https://github.com/mtailanian/uflow.
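
To make the last phase and the IoU evaluation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that, under the normal (anomaly-free) hypothesis, each pixel of the flow's latent output has `dof` independent standard Gaussian channels, so that the per-pixel sum of squares follows a chi-square law. It then flags a pixel when its Number of False Alarms (NFA), the number of tests multiplied by the tail probability, falls below `epsilon`. The function names, the single-scale setting, and the chi-square statistic are illustrative assumptions; the actual test used by the method may differ.

```python
import math


def chi2_sf_even(x, dof):
    """Tail probability P(X >= x) for a chi-square variable X.

    Uses the closed form that holds only when `dof` is a positive even
    integer: exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def a_contrario_mask(stat_map, dof, epsilon=1.0):
    """Flag pixels whose NFA = n_tests * tail probability is below epsilon.

    `stat_map` is a 2D list holding, for each pixel, the sum of squared
    latent values (assumed chi-square with `dof` degrees of freedom when
    the pixel is normal). One test is counted per pixel.
    """
    n_tests = sum(len(row) for row in stat_map)
    return [
        [n_tests * chi2_sf_even(s, dof) < epsilon for s in row]
        for row in stat_map
    ]


def iou(pred, gt):
    """Intersection over Union between two binary masks (2D lists)."""
    pairs = [(bool(p), bool(g))
             for pred_row, gt_row in zip(pred, gt)
             for p, g in zip(pred_row, gt_row)]
    inter = sum(1 for p, g in pairs if p and g)
    union = sum(1 for p, g in pairs if p or g)
    # Two empty masks agree perfectly, so define their IoU as 1.
    return inter / union if union else 1.0
```

With `epsilon=1.0`, the threshold corresponds to expecting at most one false detection per image under the assumed null model, which is what makes the operating point automatic rather than tuned per dataset.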