Anomaly detection that relies only on prior knowledge from normal samples is attracting growing attention because anomalous samples are scarce. Existing CNN-based pixel reconstruction approaches suffer from two problems. First, both the reconstruction source and the target are raw pixel values, whose semantic information is hard to distinguish. Second, CNNs tend to reconstruct normal samples and anomalies equally well, so the two remain hard to tell apart. In this paper, we propose Anomaly Detection TRansformer (ADTR), which applies a transformer to reconstruct pre-trained features. The pre-trained features contain distinguishable semantic information. In addition, the transformer is limited in its ability to reconstruct anomalies well, so anomalies can be detected easily once the reconstruction fails. Moreover, we propose novel loss functions that make our approach compatible with both the normal-sample-only case and the anomaly-available case, the latter covering both image-level and pixel-level labeled anomalies. Performance can be further improved by adding simple synthetic anomalies or external, irrelevant anomalies. Extensive experiments are conducted on anomaly detection datasets including MVTec-AD and CIFAR-10. Our method achieves superior performance compared with all baselines.
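The idea that an anomaly shows up where feature reconstruction fails can be illustrated with a minimal sketch. The abstract does not specify how the reconstruction error is turned into a score, so everything here is an assumption for illustration: the per-location L2 distance, the use of the maximum as an image-level score, and the helper names `anomaly_map` and `image_score` are all hypothetical, and the toy "reconstruction" is hand-written rather than produced by a transformer.

```python
import math


def anomaly_map(features, reconstruction):
    """Per-location L2 distance between a feature map and its reconstruction.

    Both inputs are H x W grids (nested lists) of C-dimensional feature
    vectors. The L2 distance is an assumed choice, not taken from the paper.
    """
    return [
        [
            math.sqrt(sum((f - r) ** 2 for f, r in zip(f_vec, r_vec)))
            for f_vec, r_vec in zip(f_row, r_row)
        ]
        for f_row, r_row in zip(features, reconstruction)
    ]


def image_score(score_map):
    """Image-level score: the largest per-location error (also an assumption)."""
    return max(max(row) for row in score_map)


# Toy 2 x 2 feature map with 2 channels per location.
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [3.0, 4.0]],
]

# Hand-written stand-in for a model output: three locations are reconstructed
# exactly, the bottom-right one is not.
reconstruction = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]

score_map = anomaly_map(features, reconstruction)
print(score_map)               # [[0.0, 0.0], [0.0, 5.0]]
print(image_score(score_map))  # 5.0
```

In this toy case the only non-zero entry of the score map is the location whose features were not reconstructed, which is the behavior the abstract relies on for localizing anomalies.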