Recent efforts in video anomaly detection (VAD) learn a deep autoencoder to model normal event patterns with small reconstruction errors. At test time, video inputs with large reconstruction errors are regarded as anomalies. However, these methods sometimes reconstruct abnormal inputs well because of the strong generalization ability of deep autoencoders. To address this problem, we present a novel approach for anomaly detection, which utilizes discriminative prototypes of normal data to reconstruct video frames. In this way, the model favors the reconstruction of normal events and distorts the reconstruction of abnormal events. Specifically, we use a prototype-guided memory module to perform discriminative latent embedding. We introduce a new discriminative criterion for the memory module, together with a corresponding loss function, which encourages memory items to record representative embeddings of normal data, i.e., prototypes. In addition, we design a novel two-branch autoencoder, composed of a future frame prediction network and an RGB difference generation network that share the same encoder. The stacked RGB differences contain motion information similar to optical flow, so our model can learn temporal regularity. We evaluate the effectiveness of our method on three benchmark datasets, and experimental results demonstrate that the proposed method outperforms the state of the art.
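To make the prototype-guided reconstruction idea concrete, the sketch below shows a generic memory read of the kind used in memory-augmented autoencoders: a latent query is re-expressed as a softmax-weighted combination of stored prototype items, so queries far from all prototypes cannot be reconstructed faithfully. This is an illustrative simplification in NumPy, not the paper's actual module; the `temperature` parameter and function names are assumptions for the example.

```python
import numpy as np

def memory_read(query, memory, temperature=0.1):
    """Reconstruct a latent query as a convex combination of memory items
    (prototypes). Illustrative sketch of memory addressing, not the
    authors' exact implementation.
    query:  (d,) latent vector from the encoder
    memory: (N, d) matrix of N prototype items
    """
    # Cosine similarity between the query and each memory item.
    q = query / np.linalg.norm(query)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    sim = m @ q                              # shape (N,)
    # Softmax addressing weights; low temperature sharpens the selection.
    w = np.exp(sim / temperature)
    w /= w.sum()
    # Read step: weighted sum of prototypes replaces the raw query.
    return w @ memory

# A query matching a prototype is reconstructed almost exactly,
# while an off-prototype query is pulled toward the stored normals.
prototypes = np.eye(3)
normal_query = np.array([1.0, 0.0, 0.0])
recon = memory_read(normal_query, prototypes, temperature=0.01)
```

With a sharp temperature, `recon` stays close to `normal_query`; an anomalous query with no nearby prototype would instead be mapped onto a blend of normal prototypes, enlarging its reconstruction error, which is exactly the behavior the discriminative criterion aims to strengthen.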