Deep autoencoders provide an effective tool for learning non-linear dimensionality reduction in an unsupervised way. Recently, they have been used for the task of anomaly detection in the visual domain. By optimizing for the reconstruction error using anomaly-free examples, the common belief is that a corresponding network should fail to accurately reconstruct anomalous regions in the application phase. This goal is typically addressed by controlling the capacity of the network, either by reducing the size of the bottleneck layer or by enforcing sparsity constraints on the activations. However, neither of these techniques does explicitly penalize reconstruction of anomalous signals often resulting in poor detection. We tackle this problem by adapting a self-supervised learning regime that allows the use of discriminative information during training but focuses on the data manifold of normal examples. We emphasize that inference with our approach is very efficient during training and prediction requiring a single forward pass for each input image. Our experiments on the MVTec AD dataset demonstrate high detection and localization performance. On the texture-subset, in particular, our approach consistently outperforms recent anomaly detection methods by a significant margin.
 翻译:暂无翻译