Monte Carlo 蒙特卡洛场景搜索 3D 场景理解 (Monte Carlo Scene Search for 3D Scene Understanding) - 专知论文

会员服务 ·

0

蒙特卡罗 · 可理解性 · RGB-D · 可约的 · 3D ·

2021 年 3 月 30 日

Monte Carlo Scene Search for 3D Scene Understanding

翻译：Monte Carlo 蒙特卡洛场景搜索 3D 场景理解

Shreyas Hampali,Sinisa Stekovic,Sayan Deb Sarkar,Chetan Srinivasa Kumar,Friedrich Fraundorfer,Vincent Lepetit

from arxiv, To be presented at CVPR 2021

We explore how a general AI algorithm can be used for 3D scene understanding to reduce the need for training data. More exactly, we propose a modification of the Monte Carlo Tree Search (MCTS) algorithm to retrieve objects and room layouts from noisy RGB-D scans. While MCTS was developed as a game-playing algorithm, we show it can also be used for complex perception problems. Our adapted MCTS algorithm has few easy-to-tune hyperparameters and can optimise general losses. We use it to optimise the posterior probability of objects and room layout hypotheses given the RGB-D data. This results in an analysis-by-synthesis approach that explores the solution space by rendering the current solution and comparing it to the RGB-D observations. To perform this exploration even more efficiently, we propose simple changes to the standard MCTS' tree construction and exploration policy. We demonstrate our approach on the ScanNet dataset. Our method often retrieves configurations that are better than some manual annotations, especially on layouts.

翻译：我们探索如何用通用的 AI 算法来理解 3D 场景理解来减少对培训数据的需求。更确切地说, 我们提议修改蒙特卡洛树搜索算法, 以便从吵闹的 RGB- D 扫描中检索对象和房间布局。虽然 MCTS 是作为一种游戏播放算法而开发的, 但我们也显示它也可以用于复杂的认知问题。我们改编的 MCTS 算法没有多少容易调制的超光度参数, 并且可以优化一般损失。我们使用它来优化基于 RGB- D 数据的天体和房间布局假设的远端概率。这样做的结果是一种逐个分析的合成方法, 通过提供当前解决方案和将其与 RGB- D 观测比较来探索解决方案空间。为了更高效地进行这一探索, 我们建议简单修改标准 MCTS 树构造和勘探政策。我们在扫描网数据集上展示我们的方法。我们的方法常常检索比一些手动说明更好的配置, 特别是在布局上。

1

相关内容

蒙特卡罗

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

专知会员服务

18+阅读 · 2021年5月3日

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

专知会员服务

31+阅读 · 2020年4月13日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

专知会员服务

20+阅读 · 2019年11月22日

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

专知会员服务

10+阅读 · 2019年10月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【CVPR 2019 | tutorial】视觉识别Visual Recognition and Beyond，Facebook|Ross Girshick，Justin Johnson（李飞飞高徒）

【CVPR 2019 | tutorial】视觉识别Visual Recognition and Beyond，Facebook|Ross Girshick，Justin Johnson（李飞飞高徒）

专知会员服务

29+阅读 · 2019年6月16日

已删除

将门创投

6+阅读 · 2019年4月10日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

专知

13+阅读 · 2018年5月26日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Residual Force Polytope: Admissible Task-Space Forces of Dynamic Trajectories

Arxiv

0+阅读 · 2021年5月25日

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

Arxiv

0+阅读 · 2021年5月25日

Semantic and Geometric Modeling with Neural Message Passing in 3D Scene Graphs for Hierarchical Mechanical Search

Arxiv

0+阅读 · 2021年5月23日

Flow-driven spectral chaos (FSC) method for long-time integration of second-order stochastic dynamical systems

Arxiv

0+阅读 · 2021年5月21日

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

Arxiv

4+阅读 · 2020年12月4日

Deep Learning for 3D Point Cloud Understanding: A Survey

Deep Learning for 3D Point Cloud Understanding: A Survey

Arxiv

10+阅读 · 2020年9月18日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

DPOD: Dense 6D Pose Object Detector in RGB images

DPOD: Dense 6D Pose Object Detector in RGB images

Arxiv

5+阅读 · 2019年2月28日

Joint Monocular 3D Vehicle Detection and Tracking

Joint Monocular 3D Vehicle Detection and Tracking

Arxiv

8+阅读 · 2018年12月2日

3D Pose Estimation and 3D Model Retrieval for Objects in the Wild

Arxiv

7+阅读 · 2018年3月30日

VIP会员

文章信息

相关主题

相关VIP内容

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

MonoGRNet：单目3D目标检测的通用框架（TPAMI2021）

专知会员服务

18+阅读 · 2021年5月3日

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

【北卡罗莱纳州立大学】单场景视频异常检测综述，A Survey of Single-Scene Video Anomaly Detection

专知会员服务

31+阅读 · 2020年4月13日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

专知会员服务

20+阅读 · 2019年11月22日

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

专知会员服务

10+阅读 · 2019年10月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【CVPR 2019 | tutorial】视觉识别Visual Recognition and Beyond，Facebook|Ross Girshick，Justin Johnson（李飞飞高徒）

【CVPR 2019 | tutorial】视觉识别Visual Recognition and Beyond，Facebook|Ross Girshick，Justin Johnson（李飞飞高徒）

专知会员服务

29+阅读 · 2019年6月16日

热门VIP内容

开通专知VIP会员享更多权益服务

《步兵小单元山地严寒作战指南》美军最新条令200页

《联合作战概念的发展》最新报告

俄制无人机弹药

《复杂场景下自主着陆的模型预测控制技术》92页

相关资讯

已删除

将门创投

6+阅读 · 2019年4月10日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

专知

13+阅读 · 2018年5月26日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Residual Force Polytope: Admissible Task-Space Forces of Dynamic Trajectories

Arxiv

0+阅读 · 2021年5月25日

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

Arxiv

0+阅读 · 2021年5月25日

Semantic and Geometric Modeling with Neural Message Passing in 3D Scene Graphs for Hierarchical Mechanical Search

Arxiv

0+阅读 · 2021年5月23日

Flow-driven spectral chaos (FSC) method for long-time integration of second-order stochastic dynamical systems

Arxiv

0+阅读 · 2021年5月21日

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

Arxiv

4+阅读 · 2020年12月4日

Deep Learning for 3D Point Cloud Understanding: A Survey

Deep Learning for 3D Point Cloud Understanding: A Survey

Arxiv

10+阅读 · 2020年9月18日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

DPOD: Dense 6D Pose Object Detector in RGB images

DPOD: Dense 6D Pose Object Detector in RGB images

Arxiv

5+阅读 · 2019年2月28日

Joint Monocular 3D Vehicle Detection and Tracking

Joint Monocular 3D Vehicle Detection and Tracking

Arxiv

8+阅读 · 2018年12月2日

3D Pose Estimation and 3D Model Retrieval for Objects in the Wild

Arxiv

7+阅读 · 2018年3月30日

微信扫码咨询专知VIP会员