Spatio-Temporal Self-Attention Network for Video Saliency Prediction

3D convolutional neural networks have achieved promising results on video tasks in computer vision, including video saliency prediction, the task explored in this paper. However, a 3D convolution encodes visual representations only within a fixed local spatio-temporal neighborhood determined by its kernel size, whereas human attention is often drawn to related visual features that appear at different times. To overcome this limitation, we propose a novel Spatio-Temporal Self-Attention 3D Network (STSANet) for video saliency prediction, in which multiple Spatio-Temporal Self-Attention (STSA) modules are employed at different levels of a 3D convolutional backbone to directly capture long-range relations between spatio-temporal features of different time steps. In addition, we propose an Attentional Multi-Scale Fusion (AMSF) module that integrates multi-level features while taking into account context in semantic and spatio-temporal subspaces. Extensive experiments demonstrate the contributions of the key components of our method, and results on the DHF1K, Hollywood-2, UCF, and DIEM benchmark datasets indicate that the proposed model outperforms the state-of-the-art models it is compared against.
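The core idea of relating features across time steps, rather than only within a local kernel window, can be illustrated with plain scaled dot-product self-attention over flattened spatio-temporal tokens. The sketch below is not the paper's STSA module, whose design the abstract does not describe. It assumes identity query/key/value projections (no learned weights), a toy feature layout `features[t][p]`, and a hypothetical function name `spatio_temporal_self_attention`.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatio_temporal_self_attention(features):
    """Toy self-attention over all (time, position) tokens.

    features[t][p] is a C-dimensional vector for time step t and
    spatial position p. Assumption: queries, keys, and values are the
    raw features themselves (identity projections), unlike a trained
    attention layer.
    """
    # Flatten time and space into a single token sequence so that every
    # token can attend to every other token, regardless of time step.
    tokens = [vec for frame in features for vec in frame]
    channels = len(tokens[0])
    scale = math.sqrt(channels)

    weights = []   # weights[i][j]: attention of token i on token j
    attended = []  # attention-weighted sum of values for each token
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        w = softmax(scores)
        weights.append(w)
        attended.append(
            [sum(wi * v[c] for wi, v in zip(w, tokens)) for c in range(channels)]
        )

    # Restore the [time][position][channel] layout.
    n_pos = len(features[0])
    out = [attended[t * n_pos:(t + 1) * n_pos] for t in range(len(features))]
    return out, weights


# Three time steps, two spatial positions, two channels (made-up values).
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [1.0, 0.0]],
]
out, weights = spatio_temporal_self_attention(features)
```

In this toy input, the token at time step 0, position 0 receives the same attention weight for the identical feature at time step 2, position 1 as it gives to itself, which a 3D convolution with a small temporal kernel could not relate directly in a single layer.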

