3D 人类康复游戏 (Playing for 3D Human Recovery) - 专知论文

会员服务 ·

0

3D · 缩放 · Performer · 多样性 · MoDELS ·

2021 年 10 月 14 日

Playing for 3D Human Recovery

翻译：3D 人类康复游戏

Zhongang Cai,Mingyuan Zhang,Jiawei Ren,Chen Wei,Daxuan Ren,Jiatong Li,Zhengyu Lin,Haiyu Zhao,Shuai Yi,Lei Yang,Chen Change Loy,Ziwei Liu

Image- and video-based 3D human recovery (i.e. pose and shape estimation) have achieved substantial progress. However, due to the prohibitive cost of motion capture, existing datasets are often limited in scale and diversity, which hinders the further development of more powerful models. In this work, we obtain massive human sequences as well as their 3D ground truths by playing video games. Specifically, we contribute, GTA-Human, a mega-scale and highly-diverse 3D human dataset generated with the GTA-V game engine. With a rich set of subjects, actions, and scenarios, GTA-Human serves as both an effective training source. Notably, the "unreasonable effectiveness of data" phenomenon is validated in 3D human recovery using our game-playing data. A simple frame-based baseline trained on GTA-Human already outperforms more sophisticated methods by a large margin; for video-based methods, GTA-Human demonstrates superiority over even the in-domain training set. We extend our study to larger models to observe the same consistent improvements, and the study on supervision signals suggests the rich collection of SMPL annotations is key. Furthermore, equipped with the diverse annotations in GTA-Human, we systematically investigate the performance of various methods under a wide spectrum of real-world variations, e.g. camera angles, poses, and occlusions. We hope our work could pave way for scaling up 3D human recovery to the real world.

翻译：以图像和视频为基础的3D人类恢复(即成形和形状估计)取得了巨大进展,然而,由于动作捕获成本高得令人望而却步,现有的数据集在规模和多样性上往往有限,阻碍了更强大的模型的进一步发展。在这项工作中,我们通过玩电子游戏获得了大量的人类序列及其3D地面真象。具体地说,我们为GTA-Hen作出了贡献,这是一个由GTA-V游戏引擎产生的大型和高度多样化的3D人类数据集。我们把研究扩大到更大的模型,以观察相同的改进、行动和情景,GTA-H人既是有效的培训来源。值得注意的是,“数据不合理的有效性”现象在3D人类恢复中得到验证,利用我们的游戏游戏数据得到进一步开发。一个简单的基于框架的基线,已经大大超越了更复杂的方法。对于基于视频的方法,GTA-H人类显示的优越性甚至超越了现场训练。我们把研究扩大到更大的模型,以观察同样的改进,关于监督信号的研究表明,“SMPL 数据”现象在3D上得到了丰富的真实的收集,而具有全球范围变化的SMTA图象系统化的模型是关键。

0

相关内容

3D是英文“Three Dimensions”的简称，中文是指三维、三个维度、三个坐标，即有长、有宽、有高，换句话说，就是立体的，是相对于只有长和宽的平面（2D）而言。

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【CVPR 2019|workshop】视觉问答和对话，Visual Question Answering and Dialog，斯坦福大学|Christopher Manning，Google DeepMind|Karl Moritz Hermann

【CVPR 2019|workshop】视觉问答和对话，Visual Question Answering and Dialog，斯坦福大学|Christopher Manning，Google DeepMind|Karl Moritz Hermann

专知会员服务

18+阅读 · 2019年6月17日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

计算机类 | LICS 2019等国际会议信息7条

计算机类 | LICS 2019等国际会议信息7条

Call4Papers

3+阅读 · 2018年12月17日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【学习】CVPR 2017 Tutorial：如何从图像来构建3D模型

【学习】CVPR 2017 Tutorial：如何从图像来构建3D模型

机器学习研究会

6+阅读 · 2017年8月8日

Class-agnostic Reconstruction of Dynamic Objects from Videos

Arxiv

0+阅读 · 2021年12月3日

Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning

Arxiv

0+阅读 · 2021年12月3日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

Deep Learning for 3D Point Clouds: A Survey

Deep Learning for 3D Point Clouds: A Survey

Arxiv

3+阅读 · 2019年12月27日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

DeepFakes: a New Threat to Face Recognition? Assessment and Detection

Arxiv

6+阅读 · 2018年12月20日

Occupancy Networks: Learning 3D Reconstruction in Function Space

Occupancy Networks: Learning 3D Reconstruction in Function Space

Arxiv

10+阅读 · 2018年12月10日

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Arxiv

4+阅读 · 2018年12月4日

Learning Human Pose Models from Synthesized Data for Robust RGB-D Action Recognition

Arxiv

3+阅读 · 2018年5月1日

3D Reconstruction in Canonical Co-ordinate Space from Arbitrarily Oriented 2D Images

Arxiv

4+阅读 · 2018年1月23日

VIP会员

文章信息

相关主题

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【CVPR 2019|workshop】视觉问答和对话，Visual Question Answering and Dialog，斯坦福大学|Christopher Manning，Google DeepMind|Karl Moritz Hermann

【CVPR 2019|workshop】视觉问答和对话，Visual Question Answering and Dialog，斯坦福大学|Christopher Manning，Google DeepMind|Karl Moritz Hermann

专知会员服务

18+阅读 · 2019年6月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《复杂工程系统模型驱动设计决策支持系统：早期设计阶段挑战》最新138页

《日本陆上自卫队2040年作战方式与未来作战研究》最新23页slides

人工智能作为战争武器

《后勤保障》最新23页

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

计算机类 | LICS 2019等国际会议信息7条

计算机类 | LICS 2019等国际会议信息7条

Call4Papers

3+阅读 · 2018年12月17日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【学习】CVPR 2017 Tutorial：如何从图像来构建3D模型

【学习】CVPR 2017 Tutorial：如何从图像来构建3D模型

机器学习研究会

6+阅读 · 2017年8月8日

相关论文

Class-agnostic Reconstruction of Dynamic Objects from Videos

Arxiv

0+阅读 · 2021年12月3日

Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning

Arxiv

0+阅读 · 2021年12月3日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

Deep Learning for 3D Point Clouds: A Survey

Deep Learning for 3D Point Clouds: A Survey

Arxiv

3+阅读 · 2019年12月27日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

DeepFakes: a New Threat to Face Recognition? Assessment and Detection

Arxiv

6+阅读 · 2018年12月20日

Occupancy Networks: Learning 3D Reconstruction in Function Space

Occupancy Networks: Learning 3D Reconstruction in Function Space

Arxiv

10+阅读 · 2018年12月10日

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Arxiv

4+阅读 · 2018年12月4日

Learning Human Pose Models from Synthesized Data for Robust RGB-D Action Recognition

Arxiv

3+阅读 · 2018年5月1日

3D Reconstruction in Canonical Co-ordinate Space from Arbitrarily Oriented 2D Images

Arxiv

4+阅读 · 2018年1月23日

微信扫码咨询专知VIP会员