Video anomaly detection has gained significant attention because of the growing demand for automatic monitoring of surveillance videos. In particular, the prediction-based approach is one of the most studied: after learning from the normal frames of the training set, the model detects anomalies by predicting test-set frames that contain abnormal events. However, many prediction networks are either computationally expensive because they rely on pre-trained optical flow networks, or fail to detect abnormal situations because their strong generative ability lets them predict even the anomalies. To address these shortcomings, we propose spatial rotation transformation (SRT) and temporal mixing transformation (TMT), which generate irregular patch cuboids within normal frame cuboids to enhance the learning of normal features. Additionally, the proposed patch transformations are used only during the training phase, so our model detects abnormal frames quickly at inference time. Our model is evaluated on three anomaly detection benchmarks, achieving competitive accuracy and surpassing all previous works in terms of speed.
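To make the idea of patch-level SRT and TMT concrete, the following is a minimal NumPy sketch of how such transformations might be applied to a frame cuboid. It is not the authors' implementation. The abstract does not specify the details, so the following are all assumptions made for illustration: the cuboid layout `(T, H, W, C)`, a square patch, rotation by multiples of 90 degrees, mixing by reordering frames inside the patch, and the function names `spatial_rotation`, `temporal_mixing`, and `random_patch_transform`.

```python
import numpy as np


def spatial_rotation(cuboid, top, left, size, k):
    """Assumed form of SRT: rotate one square patch by k * 90 degrees in every frame.

    cuboid has shape (T, H, W, C); the region outside the patch is left untouched.
    """
    out = cuboid.copy()
    patch = cuboid[:, top:top + size, left:left + size]
    # Rotate in the (H, W) plane; a square patch keeps its shape.
    out[:, top:top + size, left:left + size] = np.rot90(patch, k=k, axes=(1, 2))
    return out


def temporal_mixing(cuboid, top, left, size, order):
    """Assumed form of TMT: reorder the frames of one patch along the time axis.

    order is a permutation of range(T) giving the new frame order inside the patch.
    """
    out = cuboid.copy()
    out[:, top:top + size, left:left + size] = cuboid[
        order, top:top + size, left:left + size
    ]
    return out


def random_patch_transform(cuboid, size, rng):
    """Pick a random patch location and apply SRT or TMT with equal probability.

    The 50/50 choice is an illustrative assumption, not taken from the paper.
    """
    t, h, w = cuboid.shape[:3]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    if rng.random() < 0.5:
        # k in {1, 2, 3} so the patch is always actually rotated.
        return spatial_rotation(cuboid, top, left, size, k=int(rng.integers(1, 4)))
    # Note: a random permutation can occasionally be the identity.
    return temporal_mixing(cuboid, top, left, size, order=rng.permutation(t))
```

In a training loop, a call such as `random_patch_transform(cuboid, size=4, rng=np.random.default_rng())` would be applied to the input cuboid only, which matches the abstract's point that the transformation adds no cost at inference time.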