Long-range and short-range temporal modeling are two complementary and crucial aspects of video recognition. Most state-of-the-art methods focus on short-range spatio-temporal modeling and then average multiple snippet-level predictions to yield the final video-level prediction. Consequently, their video-level prediction does not account for spatio-temporal features that capture how the video evolves along the temporal dimension. In this paper, we introduce a novel Dynamic Segment Aggregation (DSA) module to capture the relationships among snippets. More specifically, we generate a dynamic kernel for a convolutional operation to adaptively aggregate long-range temporal information across adjacent snippets. The DSA module is an efficient plug-and-play module that can be combined with off-the-shelf clip-based models (e.g., TSM, I3D) to perform powerful long-range modeling with minimal overhead. We coin the resulting video architecture DSANet. We conduct extensive experiments on several video recognition benchmarks (Mini-Kinetics-200, Kinetics-400, Something-Something V1, and ActivityNet) to demonstrate its superiority. The proposed DSA module is shown to benefit various video recognition models significantly. For example, equipped with DSA modules, I3D ResNet-50 improves its top-1 accuracy from 74.9% to 78.2% on Kinetics-400. Code is available at https://github.com/whwu95/DSANet.
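To make the idea concrete, below is a minimal PyTorch sketch of a DSA-style module, not the authors' exact implementation: the bottleneck sizes, the global-pooling scheme, and the softmax kernel normalization are illustrative assumptions, and the class name `DynamicSegmentAggregation` is hypothetical. It shows the core mechanism the abstract describes: predicting a dynamic per-channel kernel from pooled snippet features, then convolving along the snippet dimension to aggregate long-range temporal information.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicSegmentAggregation(nn.Module):
    """Illustrative sketch of a DSA-style module (assumed design).

    A small bottleneck predicts a per-channel 1D kernel of size K from
    globally pooled features; the kernel is then convolved along the
    snippet axis U to aggregate information across adjacent snippets.
    """

    def __init__(self, channels, kernel_size=3, reduction=4):
        super().__init__()
        self.kernel_size = kernel_size
        # Bottleneck mapping pooled features to one kernel per channel,
        # shared across all temporal/spatial positions.
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels * kernel_size)

    def forward(self, x):
        # x: (N, U, C, T, H, W) -- N videos, U snippets per video.
        n, u, c, t, h, w = x.shape
        # Global average pool over snippets, time, and space -> (N, C).
        context = x.mean(dim=(1, 3, 4, 5))
        # Predict a normalized per-channel kernel of size K -> (N, C, K).
        kernel = self.fc2(F.relu(self.fc1(context)))
        kernel = kernel.view(n, c, self.kernel_size).softmax(dim=-1)
        # Convolve along the snippet axis only, one dynamic kernel per
        # (video, channel) pair, via a grouped 2D convolution.
        x = x.permute(0, 2, 1, 3, 4, 5).reshape(1, n * c, u, t * h * w)
        weight = kernel.view(n * c, 1, self.kernel_size, 1)
        out = F.conv2d(x, weight, groups=n * c,
                       padding=(self.kernel_size // 2, 0))
        # Restore the original (N, U, C, T, H, W) layout.
        return out.view(n, c, u, t, h, w).permute(0, 2, 1, 3, 4, 5)


# Usage: aggregate stage features from U=4 snippets of a clip-based backbone.
dsa = DynamicSegmentAggregation(channels=256)
feats = torch.randn(2, 4, 256, 8, 14, 14)   # (N, U, C, T, H, W)
out = dsa(feats)                             # same shape as the input
```

Because the output keeps the input shape, such a module can in principle be inserted between backbone stages of a clip-based model, which matches the plug-and-play role the abstract claims for DSA.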