Anomaly detection in surveillance videos is challenging yet important for ensuring public security. Unlike pixel-based anomaly detection methods, pose-based methods use highly structured skeleton data, which reduces the computational burden and avoids the negative impact of background noise. However, whereas pixel-based methods can directly exploit explicit motion features such as optical flow, pose-based methods lack an alternative dynamic representation. In this paper, a novel Motion Embedder (ME) is proposed to provide a pose motion representation from a probabilistic perspective. Furthermore, a novel task-specific Spatial-Temporal Transformer (STT) is deployed for self-supervised pose sequence reconstruction. These two modules are integrated into a unified framework for pose regularity learning, referred to as the Motion Prior Regularity Learner (MoPRL). MoPRL achieves state-of-the-art performance, with an average improvement of 4.7% AUC on several challenging datasets. Extensive experiments validate the versatility of each proposed module.