December 10, 2018

SlowFast Networks for Video Recognition

Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He

From arXiv, technical report

We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway, operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at high frame rate, to capture motion at fine temporal resolution. The Fast pathway can be made very lightweight by reducing its channel capacity, yet can learn useful temporal information for video recognition. Our models achieve strong performance for both action classification and detection in video, and large improvements are pin-pointed as contributions by our SlowFast concept. We report 79.0% accuracy on the Kinetics dataset without using any pre-training, largely surpassing the previous best results of this kind. On AVA action detection we achieve a new state-of-the-art of 28.3 mAP. Code will be made publicly available.
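The two-pathway idea in the abstract can be illustrated with a small sketch of how one clip might be sampled for each pathway. This is only an illustration under stated assumptions, not the authors' implementation: the function names and the parameters `slow_stride`, `speed_ratio`, and `channel_ratio` are hypothetical, and the example values (stride 16, ratio 8, channel fraction 1/8) are illustrative choices that the abstract does not specify.

```python
def sample_pathway_indices(num_frames, slow_stride, speed_ratio):
    """Return (slow_indices, fast_indices) of frames taken from one clip.

    slow_stride: temporal stride of the Slow pathway (hypothetical parameter).
    speed_ratio: how many times more densely the Fast pathway samples
                 frames than the Slow pathway (hypothetical parameter).
    """
    if slow_stride % speed_ratio != 0:
        raise ValueError("slow_stride must be divisible by speed_ratio")
    fast_stride = slow_stride // speed_ratio
    # Slow pathway: few frames, low frame rate.
    slow = list(range(0, num_frames, slow_stride))
    # Fast pathway: many frames, high frame rate.
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


def fast_channels(slow_channels, channel_ratio):
    """Channel width of the Fast pathway as a fraction of the Slow pathway."""
    return max(1, int(slow_channels * channel_ratio))


# Assumed example: a 64-frame clip, Slow stride 16, Fast pathway 8x denser.
slow_idx, fast_idx = sample_pathway_indices(num_frames=64, slow_stride=16, speed_ratio=8)
print(len(slow_idx), len(fast_idx))   # 4 32
print(fast_channels(64, 1 / 8))       # 8
```

With these assumed settings the Fast pathway sees eight times as many frames as the Slow pathway but uses one eighth of the channels, which is the sense in which it can be made lightweight while still observing motion at fine temporal resolution.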
