It is challenging for artificial intelligence systems to achieve accurate video recognition under a low computational budget. Efficient video recognition methods based on adaptive inference typically preview a video and focus on its salient parts to reduce computation costs. Most existing works focus on learning complex networks with objectives based on video classification. Because they treat all frames as positive samples, few of them pay attention in their supervision to the distinction between positive samples (salient frames) and negative samples (non-salient frames). To fill this gap, we propose a novel Non-saliency Suppression Network (NSNet), which effectively suppresses the responses of non-salient frames. Specifically, at the frame level, effective pseudo labels that distinguish salient from non-salient frames are generated to guide frame saliency learning. At the video level, a temporal attention module is learned under dual video-level supervision on both the salient and the non-salient representations. Saliency measurements from both levels are combined to exploit complementary multi-granularity information. Extensive experiments on four well-known benchmarks verify that NSNet not only achieves a state-of-the-art accuracy-efficiency trade-off but also presents a significantly faster (2.4x to 4.3x) practical inference speed than state-of-the-art methods. Our project page is at https://lawrencexia2008.github.io/projects/nsnet .
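The idea of combining frame-level and video-level saliency to pick the frames worth processing can be illustrated with a minimal sketch. It is not the NSNet implementation: the abstract does not say how the two measurements are fused, so the weighted average, the `weight` parameter, the top-k selection, and the function names `combine_saliency` and `select_salient_frames` are all assumptions made for illustration.

```python
from typing import List, Sequence


def combine_saliency(frame_scores: Sequence[float],
                     video_scores: Sequence[float],
                     weight: float = 0.5) -> List[float]:
    """Blend two per-frame saliency measurements.

    ASSUMPTION: a weighted average is used here only for illustration;
    the fusion rule used by NSNet is not given in the abstract.
    """
    if len(frame_scores) != len(video_scores):
        raise ValueError("both score lists must cover the same frames")
    return [weight * f + (1.0 - weight) * v
            for f, v in zip(frame_scores, video_scores)]


def select_salient_frames(frame_scores: Sequence[float],
                          video_scores: Sequence[float],
                          k: int,
                          weight: float = 0.5) -> List[int]:
    """Return the indices of the k most salient frames in temporal order.

    ASSUMPTION: top-k selection is a hypothetical stand-in for whatever
    frame sampling policy the actual method applies.
    """
    combined = combine_saliency(frame_scores, video_scores, weight)
    # Rank frame indices from most to least salient.
    ranked = sorted(range(len(combined)),
                    key=lambda i: combined[i], reverse=True)
    # Keep the k best and restore temporal order for the recognizer.
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Hypothetical scores for a four-frame clip.
    frame_level = [0.9, 0.1, 0.4, 0.8]   # e.g. from a frame-level saliency head
    video_level = [0.7, 0.2, 0.9, 0.1]   # e.g. from temporal attention weights
    print(select_salient_frames(frame_level, video_level, k=2))
```

Only the selected frames would then be passed to the heavier recognition network, which is where the computational saving in this kind of pipeline comes from.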