Alignment-guided Temporal Attention for Video Action Recognition - 专知 Papers


December 30, 2022

Alignment-guided Temporal Attention for Video Action Recognition


Yizhou Zhao, Zhenyang Li, Xun Guo, Yan Lu

From arXiv; accepted by NeurIPS 2022

Temporal modeling is crucial for various video learning tasks. Most recent approaches employ either factorized (2D+1D) or joint (3D) spatial-temporal operations to extract temporal contexts from the input frames. While the former is more efficient in computation, the latter often obtains better performance. In this paper, we attribute this to a dilemma between the sufficiency and the efficiency of interactions among various positions in different frames. These interactions affect the extraction of task-relevant information shared among frames. To resolve this issue, we prove that frame-by-frame alignments have the potential to increase the mutual information between frame representations, thereby including more task-relevant information to boost effectiveness. Then we propose Alignment-guided Temporal Attention (ATA) to extend 1-dimensional temporal attention with parameter-free patch-level alignments between neighboring frames. It can act as a general plug-in for image backbones to conduct the action recognition task without any model-specific design. Extensive experiments on multiple benchmarks demonstrate the superiority and generality of our module.
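The idea of combining a parameter-free patch-level alignment between neighboring frames with 1-dimensional temporal attention can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how patches are matched, so the greedy cosine-similarity match used here is an assumption chosen for brevity, the attention uses identity query/key/value projections instead of learned ones, and the function names `align_to_previous`, `temporal_attention`, and `aligned_temporal_attention` are hypothetical.

```python
import numpy as np


def align_to_previous(prev, curr):
    """Reorder the patches of `curr` so slot i holds the patch most similar to prev[i].

    prev, curr: arrays of shape (N, D) with N patch features of dimension D.
    Assumption: greedy argmax over cosine similarity (no learned parameters).
    """
    p = prev / (np.linalg.norm(prev, axis=1, keepdims=True) + 1e-8)
    c = curr / (np.linalg.norm(curr, axis=1, keepdims=True) + 1e-8)
    sim = p @ c.T                # (N, N) cosine similarities
    idx = sim.argmax(axis=1)     # best match in curr for each slot of prev
    return curr[idx], idx


def temporal_attention(x):
    """Self-attention along the time axis, independently for each patch slot.

    x: array of shape (T, N, D). Identity Q/K/V projections keep the sketch short.
    """
    T, N, D = x.shape
    xs = x.transpose(1, 0, 2)                          # (N, T, D)
    scores = xs @ xs.transpose(0, 2, 1) / np.sqrt(D)   # (N, T, T)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = weights @ xs                                 # (N, T, D)
    return out.transpose(1, 0, 2)                      # back to (T, N, D)


def aligned_temporal_attention(frames):
    """Align each frame to its (already aligned) predecessor, then attend over time.

    frames: array of shape (T, N, D).
    """
    aligned = [frames[0]]
    for t in range(1, frames.shape[0]):
        matched, _ = align_to_previous(aligned[-1], frames[t])
        aligned.append(matched)
    return temporal_attention(np.stack(aligned))
```

In this sketch each patch slot follows its best match from frame to frame, so the 1-dimensional attention compares features that plausibly belong to the same content rather than features that merely share a spatial position. A one-to-one assignment could replace the greedy match; that choice is left open here.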

