Previous work formulates video anomaly detection under weak labels as a typical multiple-instance learning problem. In this paper, we offer a new perspective: a supervised learning task under noisy labels. From this viewpoint, once label noise is cleaned away, we can directly apply fully supervised action classifiers to weakly supervised anomaly detection and take full advantage of these well-developed classifiers. To this end, we devise a graph convolutional network to correct noisy labels. Based on feature similarity and temporal consistency, our network propagates supervisory signals from high-confidence snippets to low-confidence ones. In this manner, the network provides cleaned supervision for action classifiers. During the test phase, we only need to obtain snippet-wise predictions from the action classifier, without any extra post-processing. Extensive experiments on three datasets at different scales with two types of action classifiers demonstrate the efficacy of our method. Remarkably, we obtain a frame-level AUC score of 82.12% on UCF-Crime.
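To make the propagation idea concrete, the following is a minimal, untrained sketch of a single confidence-weighted propagation step over two graphs, one built from feature similarity and one from temporal proximity. It is an illustration only, not the network described in the abstract: the cosine-similarity adjacency, the exponential temporal kernel, the `bandwidth` parameter, the equal averaging of the two branches, and all function names are assumptions introduced here.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero


def row_normalize(adj):
    """Scale each row of a non-negative matrix so that it sums to 1."""
    return adj / (adj.sum(axis=1, keepdims=True) + EPS)


def feature_similarity_graph(features):
    """Adjacency from cosine similarity, clipped to be non-negative (assumed form)."""
    unit = features / (np.linalg.norm(features, axis=1, keepdims=True) + EPS)
    return np.clip(unit @ unit.T, 0.0, None)


def temporal_consistency_graph(num_snippets, bandwidth=1.0):
    """Adjacency that decays with temporal distance between snippets (assumed form)."""
    idx = np.arange(num_snippets)
    return np.exp(-np.abs(idx[:, None] - idx[None, :]) / bandwidth)


def propagate(scores, features, confidence, bandwidth=1.0):
    """One propagation step: each snippet's refined score is a weighted average
    of all snippets' scores, where the weights favor neighbors that are similar
    in feature space or close in time, and that have high confidence."""
    # Weight each column by the confidence of the snippet sending the signal.
    a_feat = feature_similarity_graph(features) * confidence[None, :]
    a_temp = temporal_consistency_graph(len(scores), bandwidth) * confidence[None, :]
    # Hypothetical fusion: plain average of the two normalized branches.
    adj = 0.5 * (row_normalize(a_feat) + row_normalize(a_temp))
    return adj @ scores


# Toy example: snippet 3 has an uncertain score and low confidence,
# but resembles and sits next to confident, high-scoring snippets 2 and 4.
features = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
scores = np.array([0.1, 0.1, 0.9, 0.5, 0.9])
confidence = np.array([1.0, 1.0, 1.0, 0.1, 1.0])
refined = propagate(scores, features, confidence)
```

In this toy setting, the refined score of snippet 3 moves toward those of its confident neighbors. A trained graph convolutional network would additionally learn weight matrices and apply nonlinearities; this sketch omits both to isolate the propagation step.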