Estimating 2D human poses in each view is typically the first step in calibrated multi-view 3D pose estimation, but the performance of 2D pose detectors degrades in challenging situations such as occlusions and oblique viewing angles. To address these challenges, previous works derive point-to-point correspondences between views from epipolar geometry and use these correspondences to merge prediction heatmaps or feature representations. Instead of such post-prediction merging/calibration, we introduce a transformer framework for multi-view 3D pose estimation, aiming to directly improve the individual 2D predictors by integrating information from different views. Inspired by previous multi-modal transformers, we design a unified transformer architecture, named TransFusion, that fuses cues from both the current view and neighboring views. Moreover, we propose the concept of an epipolar field to encode 3D positional information into the transformer model. The 3D positional encoding guided by the epipolar field provides an efficient way of encoding correspondences between pixels of different views. Experiments on Human 3.6M and Ski-Pose show that our method is more efficient and achieves consistent improvements over other fusion methods. Specifically, we achieve 25.8 mm MPJPE on Human 3.6M with only 5M parameters at 256 × 256 resolution.
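To make the epipolar-field idea concrete, below is a minimal NumPy sketch of how pairwise correspondence weights between pixels of two calibrated views could be computed from epipolar geometry. The fundamental-matrix construction F = [e2]_x P2 P1^+ is standard; the Gaussian decay over point-to-epipolar-line distance and the `sigma` parameter are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_projections(P1, P2):
    """Fundamental matrix mapping view-1 pixels to view-2 epipolar lines.

    Standard construction F = [e2]_x P2 P1^+, where C1 is the camera
    centre of view 1 (right null vector of the 3x4 matrix P1) and
    e2 = P2 @ C1 is the epipole in view 2.
    """
    _, _, Vt = np.linalg.svd(P1)
    C1 = Vt[-1]                                   # null vector: P1 @ C1 = 0
    e2 = P2 @ C1
    return skew(e2) @ P2 @ np.linalg.pinv(P1)

def epipolar_field(pts1, pts2, F, sigma=8.0):
    """Pairwise epipolar-field weights between pixels of two views.

    pts1: (N, 2) pixel coordinates in view 1; pts2: (M, 2) in view 2.
    Returns an (N, M) matrix whose entry (i, j) decays with the distance
    of pts2[j] from the epipolar line of pts1[i]. The Gaussian decay and
    sigma are illustrative choices (assumption, not the paper's exact
    encoding).
    """
    N, M = len(pts1), len(pts2)
    h1 = np.hstack([pts1, np.ones((N, 1))])       # homogeneous coordinates
    h2 = np.hstack([pts2, np.ones((M, 1))])
    lines = h1 @ F.T                              # (N, 3) epipolar lines in view 2
    norm = np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    dist = np.abs(lines @ h2.T) / norm            # (N, M) point-to-line distances
    return np.exp(-dist ** 2 / (2 * sigma ** 2))
```

Such a weight matrix is dense but cheap to compute once per camera pair, and could serve as a positional bias that tells each pixel's attention which pixels in the other view are geometrically plausible correspondences.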