RGB2Hands:用单式RGB视频实时跟踪3D手互动 (RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video) - 专知论文

会员服务 ·

0

INTERACT · Performer · 3D · Extensibility · MoDELS ·

2021 年 6 月 22 日

RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video

翻译：RGB2Hands:用单式RGB视频实时跟踪3D手互动

Jiayi Wang,Franziska Mueller,Florian Bernard,Suzanne Sorli,Oleksandr Sotnychenko,Neng Qian,Miguel A. Otaduy,Dan Casas,Christian Theobalt

from arxiv, SIGGRAPH Asia 2020

Tracking and reconstructing the 3D pose and geometry of two hands in interaction is a challenging problem that has a high relevance for several human-computer interaction applications, including AR/VR, robotics, or sign language recognition. Existing works are either limited to simpler tracking settings (e.g., considering only a single hand or two spatially separated hands), or rely on less ubiquitous sensors, such as depth cameras. In contrast, in this work we present the first real-time method for motion capture of skeletal pose and 3D surface geometry of hands from a single RGB camera that explicitly considers close interactions. In order to address the inherent depth ambiguities in RGB data, we propose a novel multi-task CNN that regresses multiple complementary pieces of information, including segmentation, dense matchings to a 3D hand model, and 2D keypoint positions, together with newly proposed intra-hand relative depth and inter-hand distance maps. These predictions are subsequently used in a generative model fitting framework in order to estimate pose and shape parameters of a 3D hand model for both hands. We experimentally verify the individual components of our RGB two-hand tracking and 3D reconstruction pipeline through an extensive ablation study. Moreover, we demonstrate that our approach offers previously unseen two-hand tracking performance from RGB, and quantitatively and qualitatively outperforms existing RGB-based methods that were not explicitly designed for two-hand interactions. Moreover, our method even performs on-par with depth-based real-time methods.

翻译：跟踪和重新构建互动中的三维面貌和两只手的几何结构是一个具有挑战性的问题,它与包括AR/VR、机器人或手语识别在内的若干人体计算机互动应用高度相关。现有的工程要么局限于更简单的跟踪设置(例如只考虑一只手或两只空间分离的手),要么依赖不太普遍的传感器,例如深度摄像头。与此形成对照,我们在此工作中提出了第一个实时方法,用于运动捕获骨骼面和三维表面手的地表几何方法,从一个明确考虑密切互动的一RGB相机中提取出一个三维手的深度。为了解决RGB数据内在深度模糊的问题,我们提议了一个新型多任务CNN,它会倒退多种互补的信息,包括分解、与三维手模型的密集匹配,以及2D关键点位置,以及新提出的内部相对深度和相距距离地图。这些预测随后被用于一个基于基因化模型的准确框架,以便估算两只手的三维手模型的形状和形状。我们甚至没有实验性地核查了两部的交互式互动方法的个体互动,我们用两种方法来进行实地追踪,我们目前使用的RGB的实地和数量分析。

0

相关内容

INTERACT

IFIP TC13 Conference on Human-Computer Interaction是人机交互领域的研究者和实践者展示其工作的重要平台。多年来，这些会议吸引了来自几个国家和文化的研究人员。官网链接：http://interact2019.org/

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

专知会员服务

18+阅读 · 2021年5月3日

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

专知会员服务

22+阅读 · 2021年4月20日

【视频目标检测与跟踪：综述论文】Video Object Segmentation and Tracking: A Survey

专知会员服务

66+阅读 · 2020年6月4日

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

专知会员服务

52+阅读 · 2020年5月26日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

13+阅读 · 2019年1月16日

【泡泡一分钟】基于3D激光雷达地图的立体相机定位

【泡泡一分钟】基于3D激光雷达地图的立体相机定位

泡泡机器人SLAM

4+阅读 · 2019年1月14日

【泡泡一分钟】基于机器人的视觉惯性里程计（IROS2018-10）

【泡泡一分钟】基于机器人的视觉惯性里程计（IROS2018-10）

泡泡机器人SLAM

13+阅读 · 2019年1月3日

【泡泡一分钟】利用多相机系统实现鲁棒的视觉里程计

【泡泡一分钟】利用多相机系统实现鲁棒的视觉里程计

泡泡机器人SLAM

6+阅读 · 2018年12月31日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

极市平台

4+阅读 · 2018年4月19日

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

泡泡机器人SLAM

11+阅读 · 2018年3月31日

【泡泡一分钟】结合地图变形信息的惯性RGBD-SLAM系统（IROS-9）

【泡泡一分钟】结合地图变形信息的惯性RGBD-SLAM系统（IROS-9）

泡泡机器人SLAM

9+阅读 · 2018年3月20日

2017-最全手势识别/跟踪相关资源大列表分享（论文、数据集、比赛等）

2017-最全手势识别/跟踪相关资源大列表分享（论文、数据集、比赛等）

深度学习与NLP

64+阅读 · 2017年10月29日

ODAM: Object Detection, Association, and Mapping using Posed RGB Video

Arxiv

0+阅读 · 2021年8月23日

Monocular Real-time Full Body Capture with Inter-part Correlations

Monocular Real-time Full Body Capture with Inter-part Correlations

Arxiv

9+阅读 · 2020年12月11日

Learning to Estimate Pose and Shape of Hand-Held Objects from RGB Images

Learning to Estimate Pose and Shape of Hand-Held Objects from RGB Images

Arxiv

5+阅读 · 2019年3月8日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Arxiv

4+阅读 · 2018年12月4日

Joint Monocular 3D Vehicle Detection and Tracking

Joint Monocular 3D Vehicle Detection and Tracking

Arxiv

8+阅读 · 2018年12月2日

Monocular Object and Plane SLAM in Structured Environments

Monocular Object and Plane SLAM in Structured Environments

Arxiv

12+阅读 · 2018年9月10日

A Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking

A Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking

Arxiv

4+阅读 · 2018年7月5日

Complex-YOLO: Real-time 3D Object Detection on Point Clouds

Arxiv

3+阅读 · 2018年3月16日

Detect-and-Track: Efficient Pose Estimation in Videos

Arxiv

7+阅读 · 2017年12月26日

VIP会员

文章信息

相关主题

相关VIP内容

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

专知会员服务

18+阅读 · 2021年5月3日

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

专知会员服务

22+阅读 · 2021年4月20日

【视频目标检测与跟踪：综述论文】Video Object Segmentation and Tracking: A Survey

专知会员服务

66+阅读 · 2020年6月4日

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

专知会员服务

52+阅读 · 2020年5月26日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

13+阅读 · 2019年1月16日

【泡泡一分钟】基于3D激光雷达地图的立体相机定位

【泡泡一分钟】基于3D激光雷达地图的立体相机定位

泡泡机器人SLAM

4+阅读 · 2019年1月14日

【泡泡一分钟】基于机器人的视觉惯性里程计（IROS2018-10）

【泡泡一分钟】基于机器人的视觉惯性里程计（IROS2018-10）

泡泡机器人SLAM

13+阅读 · 2019年1月3日

【泡泡一分钟】利用多相机系统实现鲁棒的视觉里程计

【泡泡一分钟】利用多相机系统实现鲁棒的视觉里程计

泡泡机器人SLAM

6+阅读 · 2018年12月31日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

极市平台

4+阅读 · 2018年4月19日

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

泡泡机器人SLAM

11+阅读 · 2018年3月31日

【泡泡一分钟】结合地图变形信息的惯性RGBD-SLAM系统（IROS-9）

【泡泡一分钟】结合地图变形信息的惯性RGBD-SLAM系统（IROS-9）

泡泡机器人SLAM

9+阅读 · 2018年3月20日

2017-最全手势识别/跟踪相关资源大列表分享（论文、数据集、比赛等）

2017-最全手势识别/跟踪相关资源大列表分享（论文、数据集、比赛等）

深度学习与NLP

64+阅读 · 2017年10月29日

相关论文

ODAM: Object Detection, Association, and Mapping using Posed RGB Video

Arxiv

0+阅读 · 2021年8月23日

Monocular Real-time Full Body Capture with Inter-part Correlations

Monocular Real-time Full Body Capture with Inter-part Correlations

Arxiv

9+阅读 · 2020年12月11日

Learning to Estimate Pose and Shape of Hand-Held Objects from RGB Images

Learning to Estimate Pose and Shape of Hand-Held Objects from RGB Images

Arxiv

5+阅读 · 2019年3月8日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Arxiv

4+阅读 · 2018年12月4日

Joint Monocular 3D Vehicle Detection and Tracking

Joint Monocular 3D Vehicle Detection and Tracking

Arxiv

8+阅读 · 2018年12月2日

Monocular Object and Plane SLAM in Structured Environments

Monocular Object and Plane SLAM in Structured Environments

Arxiv

12+阅读 · 2018年9月10日

A Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking

A Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking

Arxiv

4+阅读 · 2018年7月5日

Complex-YOLO: Real-time 3D Object Detection on Point Clouds

Arxiv

3+阅读 · 2018年3月16日

Detect-and-Track: Efficient Pose Estimation in Videos

Arxiv

7+阅读 · 2017年12月26日

微信扫码咨询专知VIP会员