3DP3: 3D Scene Perception via Probabilistic Programming - Zhuanzhi Papers


October 30, 2021


Nishad Gothoskar, Marco Cusumano-Towner, Ben Zinberg, Matin Ghavamizadeh, Falk Pollok, Austin Garrett, Joshua B. Tenenbaum, Dan Gutfreund, Vikash K. Mansinghka

From arXiv, NeurIPS 2021

We present 3DP3, a framework for inverse graphics that uses inference in a structured generative model of objects, scenes, and images. 3DP3 uses (i) voxel models to represent the 3D shape of objects, (ii) hierarchical scene graphs to decompose scenes into objects and the contacts between them, and (iii) depth image likelihoods based on real-time graphics. Given an observed RGB-D image, 3DP3's inference algorithm infers the underlying latent 3D scene, including the object poses and a parsimonious joint parametrization of these poses, using fast bottom-up pose proposals, novel involutive MCMC updates of the scene graph structure, and, optionally, neural object detectors and pose estimators. We show that 3DP3 enables scene understanding that is aware of 3D shape, occlusion, and contact structure. Our results demonstrate that 3DP3 is more accurate at 6DoF object pose estimation from real images than deep learning baselines and shows better generalization to challenging scenes with novel viewpoints, contact, and partial observability.
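The idea of a scene graph that yields a parsimonious joint parametrization of object poses can be illustrated with a small sketch. This is not the authors' implementation: the `Pose` and `Node` classes, the box-shaped objects, the restriction to yaw-only rotation, and the three-number contact `(x, y, yaw)` on the parent's top face are all assumptions made for illustration. The point is only that an object resting on another needs fewer free parameters than a full 6DoF pose, because the contact fixes the rest.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Pose:
    """Hypothetical simplified pose: translation plus yaw (rotation about z)."""
    x: float
    y: float
    z: float
    yaw: float

    def compose(self, rel: "Pose") -> "Pose":
        """Map a pose `rel`, given in this pose's frame, into the outer frame."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        return Pose(
            self.x + c * rel.x - s * rel.y,
            self.y + s * rel.x + c * rel.y,
            self.z + rel.z,
            self.yaw + rel.yaw,
        )


@dataclass
class Node:
    """Hypothetical scene graph node for a box-shaped object.

    A root node stores an absolute pose. A child node stores only a planar
    contact (x, y, yaw) describing where it sits on its parent's top face.
    """
    name: str
    height: float
    root_pose: Optional[Pose] = None
    parent: Optional["Node"] = None
    contact: Optional[Tuple[float, float, float]] = None


def world_pose(node: Node) -> Pose:
    """Recover the absolute pose of a node by walking up the scene graph."""
    if node.parent is None:
        if node.root_pose is None:
            raise ValueError(f"root node {node.name!r} needs an absolute pose")
        return node.root_pose
    if node.contact is None:
        raise ValueError(f"child node {node.name!r} needs a contact")
    cx, cy, cyaw = node.contact
    # The contact pins the child's height to the parent's top face,
    # so only three numbers remain free.
    rel = Pose(cx, cy, node.parent.height, cyaw)
    return world_pose(node.parent).compose(rel)


# A table rotated by 90 degrees, with a mug resting 0.5 units along the table's x-axis.
table = Node("table", height=0.75, root_pose=Pose(1.0, 2.0, 0.0, math.pi / 2))
mug = Node("mug", height=0.1, parent=table, contact=(0.5, 0.0, 0.0))
mug_pose = world_pose(mug)
```

Under these assumptions, moving the table moves the mug with it, and a sampler that proposes changes to the graph structure (for example, re-attaching the mug to a different parent) changes which parameters are free.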


