TubeR: 用于视频动作探测的 Tubelet 变换器 (TubeR: Tubelet Transformer for Video Action Detection) - 专知论文

会员服务 ·

0

Performer · 变换 · state-of-the-art · MoDELS · SimPLe ·

2021 年 12 月 6 日

TubeR: Tubelet Transformer for Video Action Detection

翻译：TubeR: 用于视频动作探测的 Tubelet 变换器

Jiaojiao Zhao,Yanyi Zhang,Xinyu Li,Hao Chen,Shuai Bing,Mingze Xu,Chunhui Liu,Kaustav Kundu,Yuanjun Xiong,Davide Modolo,Ivan Marsic,Cees G. M. Snoek,Joseph Tighe

We propose TubeR: a simple solution for spatio-temporal video action detection. Different from existing methods that depend on either an off-line actor detector or hand-designed actor-positional hypotheses like proposals or anchors, we propose to directly detect an action tubelet in a video by simultaneously performing action localization and recognition from a single representation. TubeR learns a set of tubelet-queries and utilizes a tubelet-attention module to model the dynamic spatio-temporal nature of a video clip, which effectively reinforces the model capacity compared to using actor-positional hypotheses in the spatio-temporal space. For videos containing transitional states or scene changes, we propose a context aware classification head to utilize short-term and long-term context to strengthen action classification, and an action switch regression head for detecting the precise temporal action extent. TubeR directly produces action tubelets with variable lengths and even maintains good results for long video clips. TubeR outperforms the previous state-of-the-art on commonly used action detection datasets AVA, UCF101-24 and JHMDB51-21.

翻译：我们提议了TubeR:一个用于时空视频动作探测的简单解决方案。与现有方法不同,这些方法既依赖于离线的行为者探测器,也依赖于手设计的行为者位置假设,如建议或锚,我们提议通过同时执行动作定位和单个代表的识别,在视频中直接检测一个动作管管管。TubeR学习了一组管式管式queries,并利用一个管式感应模块来模拟一个视频片段的动态空间-时空性质,这与在spatio-时间空间使用演员定位假说相比,有效地加强了模型能力。对于包含过渡状态或场景变化的视频,我们提议了一个有意识的背景分类头,以利用短期和长期的环境加强行动分类,以及一个动作开关回归头,以探测精确的时间动作范围。TubeR直接生成了具有不同长度的动作管子,甚至保持长视频片段的良好结果。TubeR超越了以前在常用动作检测数据集AVA、UCF101-24和JDB-21上使用的状态。

0

相关内容

Performer

近期必读的5篇顶会CVPR 2021【视频理解】相关论文和代码

专知会员服务

38+阅读 · 2021年3月31日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【CVPR2020-微软-CMU】视频物体分割的一种直推方法，Video Object Segmentation

【CVPR2020-微软-CMU】视频物体分割的一种直推方法，Video Object Segmentation

专知会员服务

7+阅读 · 2020年4月16日

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

专知会员服务

31+阅读 · 2020年4月13日

【旷视-CVPR2020】领域自适应对象检测的探索类别正则化，Exploring Categorical Regularization for Domain Adaptive Object Detection

【旷视-CVPR2020】领域自适应对象检测的探索类别正则化，Exploring Categorical Regularization for Domain Adaptive Object Detection

专知会员服务

38+阅读 · 2020年3月23日

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

专知会员服务

51+阅读 · 2020年3月17日

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

ICCV 2019 行为识别/视频理解论文汇总

ICCV 2019 行为识别/视频理解论文汇总

极市平台

15+阅读 · 2019年9月26日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

极市平台

12+阅读 · 2018年8月24日

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

专知

31+阅读 · 2018年6月4日

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

专知

22+阅读 · 2018年4月21日

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

统计学习与视觉计算组

17+阅读 · 2018年3月16日

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

专知

8+阅读 · 2018年2月5日

Temporal Action Detection (时序动作检测)方向2017年会议论文整理

Temporal Action Detection (时序动作检测)方向2017年会议论文整理

极市平台

3+阅读 · 2017年11月30日

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

专知

6+阅读 · 2017年10月14日

OadTR: Online Action Detection with Transformers

Arxiv

7+阅读 · 2021年6月21日

Image/Video Deep Anomaly Detection: A Survey

Arxiv

16+阅读 · 2021年3月2日

Coarse-Fine Networks for Temporal Activity Detection in Videos

Arxiv

3+阅读 · 2021年3月1日

Few-shot Scene-adaptive Anomaly Detection

Few-shot Scene-adaptive Anomaly Detection

Arxiv

8+阅读 · 2020年7月15日

Object Detection in Videos by High Quality Object Linking

Arxiv

4+阅读 · 2019年4月8日

Progressive Sparse Local Attention for Video object detection

Arxiv

4+阅读 · 2019年3月21日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

Mobile Video Object Detection with Temporally-Aware Feature Maps

Arxiv

11+阅读 · 2018年3月28日

Object Detection in Videos by Short and Long Range Object Linking

Arxiv

6+阅读 · 2018年1月30日

Spatial-Temporal Memory Networks for Video Object Detection

Arxiv

4+阅读 · 2017年12月18日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

近期必读的5篇顶会CVPR 2021【视频理解】相关论文和代码

专知会员服务

38+阅读 · 2021年3月31日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【CVPR2020-微软-CMU】视频物体分割的一种直推方法，Video Object Segmentation

【CVPR2020-微软-CMU】视频物体分割的一种直推方法，Video Object Segmentation

专知会员服务

7+阅读 · 2020年4月16日

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

专知会员服务

31+阅读 · 2020年4月13日

【旷视-CVPR2020】领域自适应对象检测的探索类别正则化，Exploring Categorical Regularization for Domain Adaptive Object Detection

【旷视-CVPR2020】领域自适应对象检测的探索类别正则化，Exploring Categorical Regularization for Domain Adaptive Object Detection

专知会员服务

38+阅读 · 2020年3月23日

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

【AAAI2020】Context-Transformer:上下文转换器:解决对象混淆的小样本检测，Context-Transformer: Tackling Object Confusion for Few-Shot Detection

专知会员服务

51+阅读 · 2020年3月17日

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《人工智能绝不能完全自主》

《人工智能的法律与伦理：军事自主机器独特挑战的深度剖析》316页

从数据到主导：AI与兵棋推演构筑决策优势

《特洛伊木马货柜：武器化集装箱的战略威胁》最新报告

相关资讯

ICCV 2019 行为识别/视频理解论文汇总

ICCV 2019 行为识别/视频理解论文汇总

极市平台

15+阅读 · 2019年9月26日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

极市平台

12+阅读 · 2018年8月24日

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

专知

31+阅读 · 2018年6月4日

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

专知

22+阅读 · 2018年4月21日

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

统计学习与视觉计算组

17+阅读 · 2018年3月16日

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

专知

8+阅读 · 2018年2月5日

Temporal Action Detection (时序动作检测)方向2017年会议论文整理

Temporal Action Detection (时序动作检测)方向2017年会议论文整理

极市平台

3+阅读 · 2017年11月30日

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

专知

6+阅读 · 2017年10月14日

相关论文

OadTR: Online Action Detection with Transformers

Arxiv

7+阅读 · 2021年6月21日

Image/Video Deep Anomaly Detection: A Survey

Arxiv

16+阅读 · 2021年3月2日

Coarse-Fine Networks for Temporal Activity Detection in Videos

Arxiv

3+阅读 · 2021年3月1日

Few-shot Scene-adaptive Anomaly Detection

Few-shot Scene-adaptive Anomaly Detection

Arxiv

8+阅读 · 2020年7月15日

Object Detection in Videos by High Quality Object Linking

Arxiv

4+阅读 · 2019年4月8日

Progressive Sparse Local Attention for Video object detection

Arxiv

4+阅读 · 2019年3月21日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

Mobile Video Object Detection with Temporally-Aware Feature Maps

Arxiv

11+阅读 · 2018年3月28日

Object Detection in Videos by Short and Long Range Object Linking

Arxiv

6+阅读 · 2018年1月30日

Spatial-Temporal Memory Networks for Video Object Detection

Arxiv

4+阅读 · 2017年12月18日

微信扫码咨询专知VIP会员