3D 利用空间和时空变换器进行人类粒子估计 (3D Human Pose Estimation with Spatial and Temporal Transformers) - 专知论文

会员服务 ·

0

估计/估计量 · 变换 · Extensibility · 3D · Vision ·

2021 年 3 月 18 日

3D Human Pose Estimation with Spatial and Temporal Transformers

翻译：3D 利用空间和时空变换器进行人类粒子估计

Ce Zheng,Sijie Zhu,Matias Mendieta,Taojiannan Yang,Chen Chen,Zhengming Ding

Transformer architectures have become the model of choice in natural language processing and are now being introduced into computer vision tasks such as image classification, object detection, and semantic segmentation. However, in the field of human pose estimation, convolutional architectures still remain dominant. In this work, we present PoseFormer, a purely transformer-based approach for 3D human pose estimation in videos without convolutional architectures involved. Inspired by recent developments in vision transformers, we design a spatial-temporal transformer structure to comprehensively model the human joint relations within each frame as well as the temporal correlations across frames, then output an accurate 3D human pose of the center frame. We quantitatively and qualitatively evaluate our method on two popular and standard benchmark datasets: Human3.6M and MPI-INF-3DHP. Extensive experiments show that PoseFormer achieves state-of-the-art performance on both datasets. Code is available at \url{https://github.com/zczcwh/PoseFormer}

翻译：在自然语言处理过程中,变形器结构已成为选择自然语言处理的模型,目前正在引入计算机视觉任务,如图像分类、物体探测和语义分割等。然而,在人造图估测领域,变形结构仍然占主导地位。在这项工作中,我们介绍了PoseFormer, 这是一种纯粹以变压器为基础的方法,用于在视频中进行3D人造图估测,而没有涉及变动结构。受视觉变压器最近发展情况的启发,我们设计了一个时空变异器结构,全面模拟每个框架内部的人类联合关系以及跨框架的时间相关性,然后输出一个准确的3D人造图。我们在两个流行和标准的基准数据集:Human3.6M和MPI-INF-3DHP上,我们从数量和质量上评估了我们的方法。广泛的实验显示,PoseFormer在两个数据集上都取得了状态和艺术表现。代码可在\url{http://github.com/zzwh/Posemer}

0

相关内容

估计/估计量

估计/估计量

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

计算机视觉领域顶会CVPR 2018 接受论文列表

计算机视觉领域顶会CVPR 2018 接受论文列表

专知

7+阅读 · 2018年5月26日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

专知

8+阅读 · 2018年2月5日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

Arxiv

4+阅读 · 2021年1月17日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

TDN: Temporal Difference Networks for Efficient Action Recognition

TDN: Temporal Difference Networks for Efficient Action Recognition

Arxiv

4+阅读 · 2020年12月18日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

Deep High-Resolution Representation Learning for Human Pose Estimation

Arxiv

5+阅读 · 2019年2月25日

3D Backbone Network for 3D Object Detection

Arxiv

12+阅读 · 2019年1月24日

Detect-and-Track: Efficient Pose Estimation in Videos

Arxiv

5+阅读 · 2018年5月2日

Fine-Grained Head Pose Estimation Without Keypoints

Arxiv

5+阅读 · 2018年4月13日

3D Pose Estimation and 3D Model Retrieval for Objects in the Wild

Arxiv

7+阅读 · 2018年3月30日

2D-3D Pose Consistency-based Conditional Random Fields for 3D Human Pose Estimation

Arxiv

3+阅读 · 2017年12月28日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

人工智能驱动弹药制造现代化：美国陆军转型之路

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

相关资讯

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

计算机视觉领域顶会CVPR 2018 接受论文列表

计算机视觉领域顶会CVPR 2018 接受论文列表

专知

7+阅读 · 2018年5月26日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

【论文推荐】最新6篇目标检测（Object Detection）相关论文—物体链接、手机端、三维地图、航空图像、检测与姿态估计

专知

8+阅读 · 2018年2月5日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

Arxiv

4+阅读 · 2021年1月17日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

TDN: Temporal Difference Networks for Efficient Action Recognition

TDN: Temporal Difference Networks for Efficient Action Recognition

Arxiv

4+阅读 · 2020年12月18日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

Deep High-Resolution Representation Learning for Human Pose Estimation

Arxiv

5+阅读 · 2019年2月25日

3D Backbone Network for 3D Object Detection

Arxiv

12+阅读 · 2019年1月24日

Detect-and-Track: Efficient Pose Estimation in Videos

Arxiv

5+阅读 · 2018年5月2日

Fine-Grained Head Pose Estimation Without Keypoints

Arxiv

5+阅读 · 2018年4月13日

3D Pose Estimation and 3D Model Retrieval for Objects in the Wild

Arxiv

7+阅读 · 2018年3月30日

2D-3D Pose Consistency-based Conditional Random Fields for 3D Human Pose Estimation

Arxiv

3+阅读 · 2017年12月28日

微信扫码咨询专知VIP会员