This paper presents a novel framework, the Global-Local Correspondence Framework (GLCF), for visual anomaly detection with logical constraints. Visual anomaly detection has become an active research area with real-world applications such as industrial anomaly detection and medical disease diagnosis. However, most existing methods focus on identifying local structural degeneration anomalies and often fail to detect high-level functional anomalies that involve logical constraints. To address this issue, we propose a two-branch approach consisting of a local branch for detecting structural anomalies and a global branch for detecting logical anomalies. To facilitate local-global feature correspondence, we introduce a novel semantic bottleneck enabled by the vision Transformer. Moreover, we develop a separate feature estimation network for each branch to detect anomalies. We validate the proposed framework on several benchmarks, including the industrial datasets MVTec AD and MVTec LOCO AD and the medical dataset Retinal-OCT. Experimental results show that our method outperforms existing methods, particularly in detecting logical anomalies.