MonoScene proposes a 3D Semantic Scene Completion (SSC) framework in which the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Unlike the SSC literature, which relies on 2.5D or 3D input, we solve the complex problem of 2D-to-3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets bridged by a novel 2D-3D features projection inspired by optics, and introduces a 3D context relation prior to enforce spatio-semantic consistency. Along with these architectural contributions, we introduce novel global scene and local frustum losses. Experiments show that we outperform the literature on all metrics and datasets, while hallucinating plausible scenery even beyond the camera's field of view. Our code and trained models are available at https://github.com/cv-rits/MonoScene
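For intuition, below is a minimal PyTorch sketch of what an optics-inspired 2D-to-3D feature projection of this kind can look like: each 3D voxel center is projected through a pinhole camera model onto the image plane, and the 2D feature map is bilinearly sampled at that location. This is an illustrative sketch under assumed conventions (camera-frame voxel centers, a single intrinsics matrix), not the repository's implementation; the function name `sample_2d_features_for_voxels` is hypothetical.

```python
import torch
import torch.nn.functional as F

def sample_2d_features_for_voxels(feat_2d, voxel_centers, K, img_size):
    """Lift 2D image features onto a 3D voxel grid (illustrative sketch).

    feat_2d:       (B, C, H, W) feature map from a 2D UNet
    voxel_centers: (B, N, 3) voxel centers in camera coordinates
    K:             (B, 3, 3) pinhole camera intrinsics
    img_size:      (H_img, W_img) of the image the intrinsics refer to
    returns:       (B, C, N) per-voxel features (zeros outside the FoV)
    """
    H_img, W_img = img_size

    # Pinhole projection: pixel_homogeneous = K @ (x, y, z)^T per voxel.
    pix = torch.bmm(voxel_centers, K.transpose(1, 2))       # (B, N, 3)
    z = pix[..., 2:3].clamp(min=1e-6)
    uv = pix[..., :2] / z                                   # (B, N, 2) pixel coords

    # Normalize pixel coordinates to [-1, 1] as required by grid_sample.
    u = uv[..., 0] / (W_img - 1) * 2 - 1
    v = uv[..., 1] / (H_img - 1) * 2 - 1
    grid = torch.stack([u, v], dim=-1).unsqueeze(1)         # (B, 1, N, 2)

    # Bilinear sampling; padding_mode="zeros" zeroes voxels projecting
    # outside the image, so out-of-FoV voxels receive no 2D evidence.
    sampled = F.grid_sample(feat_2d, grid, mode="bilinear",
                            padding_mode="zeros", align_corners=True)

    # Voxels behind the camera have no valid projection; zero them out too.
    behind = (pix[..., 2] <= 0).unsqueeze(1)                # (B, 1, N)
    return sampled.squeeze(2).masked_fill(behind, 0.0)      # (B, C, N)
```

Repeating this sampling at several 2D UNet scales and summing the results would give each voxel a multi-scale image feature, which a 3D UNet can then refine; voxels outside the field of view carry zero features and must be completed from 3D context alone.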