3DV: 3D 动态Voxel 用于在深度视频中行动识别 (3DV: 3D Dynamic Voxel for Action Recognition in Depth Video) - 专知论文

会员服务 ·

0

3DV · 3D · INFORMS · 情景 · Extensibility ·

2020 年 5 月 12 日

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video

翻译：3DV: 3D 动态Voxel 用于在深度视频中行动识别

Yancheng Wang,Yang Xiao,Fu Xiong,Wenxiang Jiang,Zhiguo Cao,Joey Tianyi Zhou,Junsong Yuan

from arxiv, Accepted by CVPR2020

To facilitate depth-based 3D action recognition, 3D dynamic voxel (3DV) is proposed as a novel 3D motion representation. With 3D space voxelization, the key idea of 3DV is to encode 3D motion information within depth video into a regular voxel set (i.e., 3DV) compactly, via temporal rank pooling. Each available 3DV voxel intrinsically involves 3D spatial and motion feature jointly. 3DV is then abstracted as a point set and input into PointNet++ for 3D action recognition, in the end-to-end learning way. The intuition for transferring 3DV into the point set form is that, PointNet++ is lightweight and effective for deep feature learning towards point set. Since 3DV may lose appearance clue, a multi-stream 3D action recognition manner is also proposed to learn motion and appearance feature jointly. To extract richer temporal order information of actions, we also divide the depth video into temporal splits and encode this procedure in 3DV integrally. The extensive experiments on 4 well-established benchmark datasets demonstrate the superiority of our proposition. Impressively, we acquire the accuracy of 82.4% and 93.5% on NTU RGB+D 120 [13] with the cross-subject and crosssetup test setting respectively. 3DV's code is available at https://github.com/3huo/3DV-Action.

翻译：为了便利基于深度的 3D 3D 动作识别, 3D 动态 voxel (3DV) 将3D 动态 voxel (3D 3D 3D V) 作为一种新型 3D 动作演示。 3D V 3D 3D 3D 空间 voxel (3D 3D 3D 3D 3D 3D 3D 动作演示, 3D 3D 3D 3D 3xelization 3D 3D 3D 3D 3 的直觉将3D 运动信息在深度视频中通过时间级集合将3D 运动信息集中编码成普通的 Voxel 3V 3D 3D 3D 3D 3D 3D 3 3D 3D 3 3D 3D 3 3D 3D 3D 3D 3D 3 3D 3D 3D 3 3 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3 3 3 3 3D 3 3 3D 3D 3D 3 3 3 3 3D 3D 3 3 3 3 3 3 3 3D 3 3 3 3D 3D 3 3 3 3 3 3 3 3D 3 3 3 3

0

相关内容

3DV

3DV（3D视觉）会议提供了一个绝佳的平台，用于传播研究结果，涵盖计算机视觉和图形3D研究领域的广泛主题，包括新型光学传感器，信号处理，几何建模，表示和传输，可视化和交互以及各种应用程序。官网地址：https://dblp.uni-trier.de/db/conf/3dim/

CVPR2020 | 商汤-港中文等提出PV-RCNN：3D目标检测新网络

CVPR2020 | 商汤-港中文等提出PV-RCNN：3D目标检测新网络

专知会员服务

45+阅读 · 2020年4月17日

【阿里巴巴-CVPR2020】频域学习，Learning in the Frequency Domain

【阿里巴巴-CVPR2020】频域学习，Learning in the Frequency Domain

专知会员服务

29+阅读 · 2020年3月14日

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

专知会员服务

37+阅读 · 2020年2月27日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

专知会员服务

78+阅读 · 2020年2月25日

【厦门大学】综述：深度学习3D点云分割，Review: deep learning on 3D point clouds

【厦门大学】综述：深度学习3D点云分割，Review: deep learning on 3D point clouds

专知会员服务

71+阅读 · 2020年1月22日

【深度估计| 2019最新综述】单目深度估计方法综述（Monocular Depth Estimation: A Survey）

专知会员服务

69+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

CVPR2019| 04-30更新23篇论文及代码合集（3篇oral，含3D目标检测/语义分割/动作识别等）

CVPR2019| 04-30更新23篇论文及代码合集（3篇oral，含3D目标检测/语义分割/动作识别等）

极市平台

29+阅读 · 2019年4月30日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

CVPR2019| 04-17更新17篇论文及代码（目标检测、语义分割、损失函数、姿态估计等）

CVPR2019| 04-17更新17篇论文及代码（目标检测、语义分割、损失函数、姿态估计等）

极市平台

24+阅读 · 2019年4月17日

CVPR2019 | 03-25日更新12篇论文及代码汇总（目标检测、姿态估计、跟踪、VQA等）

CVPR2019 | 03-25日更新12篇论文及代码汇总（目标检测、姿态估计、跟踪、VQA等）

极市平台

5+阅读 · 2019年3月25日

CVPR2019 | 03-13日更新16篇论文及代码汇总（行人重识别、人体姿态估计、GAN、手写体识别等）

CVPR2019 | 03-13日更新16篇论文及代码汇总（行人重识别、人体姿态估计、GAN、手写体识别等）

极市平台

7+阅读 · 2019年3月13日

CVPR2019 | 03-12日更新13篇论文汇总（包括语义分割、领域自适应、视频分析等）

CVPR2019 | 03-12日更新13篇论文汇总（包括语义分割、领域自适应、视频分析等）

极市平台

14+阅读 · 2019年3月12日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

泡泡机器人SLAM

17+阅读 · 2018年10月12日

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

泡泡机器人SLAM

11+阅读 · 2018年3月31日

Speech2Action: Cross-modal Supervision for Action Recognition

Speech2Action: Cross-modal Supervision for Action Recognition

Arxiv

7+阅读 · 2020年3月30日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

4+阅读 · 2019年4月18日

PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud

Arxiv

7+阅读 · 2018年12月11日

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Arxiv

4+阅读 · 2018年12月4日

Learning Human Pose Models from Synthesized Data for Robust RGB-D Action Recognition

Arxiv

3+阅读 · 2018年5月1日

Visual Tracking via Dynamic Graph Learning

Arxiv

5+阅读 · 2018年4月30日

Learning Representative Temporal Features for Action Recognition

Arxiv

4+阅读 · 2018年3月14日

3D-SSD: Learning Hierarchical Features from RGB-D Images for Amodal 3D Object Detection

Arxiv

8+阅读 · 2018年2月21日

Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor

Arxiv

4+阅读 · 2017年12月19日

VIP会员

文章信息

相关主题

相关VIP内容

CVPR2020 | 商汤-港中文等提出PV-RCNN：3D目标检测新网络

CVPR2020 | 商汤-港中文等提出PV-RCNN：3D目标检测新网络

专知会员服务

45+阅读 · 2020年4月17日

【阿里巴巴-CVPR2020】频域学习，Learning in the Frequency Domain

【阿里巴巴-CVPR2020】频域学习，Learning in the Frequency Domain

专知会员服务

29+阅读 · 2020年3月14日

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

专知会员服务

37+阅读 · 2020年2月27日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

专知会员服务

78+阅读 · 2020年2月25日

【厦门大学】综述：深度学习3D点云分割，Review: deep learning on 3D point clouds

【厦门大学】综述：深度学习3D点云分割，Review: deep learning on 3D point clouds

专知会员服务

71+阅读 · 2020年1月22日

【深度估计| 2019最新综述】单目深度估计方法综述（Monocular Depth Estimation: A Survey）

专知会员服务

69+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

CVPR2019| 04-30更新23篇论文及代码合集（3篇oral，含3D目标检测/语义分割/动作识别等）

CVPR2019| 04-30更新23篇论文及代码合集（3篇oral，含3D目标检测/语义分割/动作识别等）

极市平台

29+阅读 · 2019年4月30日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

CVPR2019| 04-17更新17篇论文及代码（目标检测、语义分割、损失函数、姿态估计等）

CVPR2019| 04-17更新17篇论文及代码（目标检测、语义分割、损失函数、姿态估计等）

极市平台

24+阅读 · 2019年4月17日

CVPR2019 | 03-25日更新12篇论文及代码汇总（目标检测、姿态估计、跟踪、VQA等）

CVPR2019 | 03-25日更新12篇论文及代码汇总（目标检测、姿态估计、跟踪、VQA等）

极市平台

5+阅读 · 2019年3月25日

CVPR2019 | 03-13日更新16篇论文及代码汇总（行人重识别、人体姿态估计、GAN、手写体识别等）

CVPR2019 | 03-13日更新16篇论文及代码汇总（行人重识别、人体姿态估计、GAN、手写体识别等）

极市平台

7+阅读 · 2019年3月13日

CVPR2019 | 03-12日更新13篇论文汇总（包括语义分割、领域自适应、视频分析等）

CVPR2019 | 03-12日更新13篇论文汇总（包括语义分割、领域自适应、视频分析等）

极市平台

14+阅读 · 2019年3月12日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

泡泡机器人SLAM

17+阅读 · 2018年10月12日

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

泡泡机器人SLAM

11+阅读 · 2018年3月31日

相关论文

Speech2Action: Cross-modal Supervision for Action Recognition

Speech2Action: Cross-modal Supervision for Action Recognition

Arxiv

7+阅读 · 2020年3月30日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

4+阅读 · 2019年4月18日

PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud

Arxiv

7+阅读 · 2018年12月11日

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Arxiv

4+阅读 · 2018年12月4日

Learning Human Pose Models from Synthesized Data for Robust RGB-D Action Recognition

Arxiv

3+阅读 · 2018年5月1日

Visual Tracking via Dynamic Graph Learning

Arxiv

5+阅读 · 2018年4月30日

Learning Representative Temporal Features for Action Recognition

Arxiv

4+阅读 · 2018年3月14日

3D-SSD: Learning Hierarchical Features from RGB-D Images for Amodal 3D Object Detection

Arxiv

8+阅读 · 2018年2月21日

Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor

Arxiv

4+阅读 · 2017年12月19日

微信扫码咨询专知VIP会员