Existing monocular depth estimation methods have achieved excellent robustness in diverse scenes, but they can only retrieve affine-invariant depth, up to an unknown scale and shift. However, in some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in the per-frame predictions may cause depth inconsistency. To solve this problem, we propose a locally weighted linear regression method that recovers the scale and shift from very sparse anchor points, which ensures scale consistency across consecutive frames. Extensive experiments show that our method can boost the performance of existing state-of-the-art approaches by up to 50% on several zero-shot benchmarks. Besides, we merge over 6.3 million RGBD images to train strong and robust depth models. Our ResNet50-backbone model even outperforms the state-of-the-art DPT ViT-Large model. Combined with geometry-based reconstruction methods, we formulate a new dense 3D scene reconstruction pipeline, which benefits from both the scale consistency of sparse points and the robustness of monocular methods. By performing simple per-frame prediction over a video, accurate 3D scene shape can be recovered.
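Below is a minimal sketch of how a locally weighted linear regression could recover a per-pixel scale and shift for an affine-invariant depth map from sparse anchor points. The Gaussian distance kernel, the bandwidth value, the regularization term, and the dense per-pixel loop are illustrative assumptions for clarity, not the paper's exact formulation.

```python
import numpy as np

def lwlr_scale_shift(pred_depth, anchor_uv, anchor_depth, bandwidth=0.2):
    """Align an affine-invariant depth prediction to sparse metric anchors.

    pred_depth   : (H, W) affine-invariant depth prediction
    anchor_uv    : (N, 2) integer pixel coordinates (row, col) of anchors
    anchor_depth : (N,)   metric depth values at the anchors
    bandwidth    : Gaussian kernel bandwidth, as a fraction of the image diagonal
                   (hypothetical parameter for this sketch)
    """
    H, W = pred_depth.shape
    diag = np.hypot(H, W)

    # Predicted depth sampled at the anchor locations, with a bias column
    d_pred = pred_depth[anchor_uv[:, 0], anchor_uv[:, 1]]      # (N,)
    X = np.stack([d_pred, np.ones_like(d_pred)], axis=1)       # (N, 2)

    aligned = np.empty_like(pred_depth)
    for i in range(H):
        for j in range(W):
            # Distance-based weights: nearby anchors dominate the local fit
            dist = np.hypot(anchor_uv[:, 0] - i, anchor_uv[:, 1] - j) / diag
            w = np.exp(-0.5 * (dist / bandwidth) ** 2)

            # Weighted least squares for a local scale s and shift t
            A = X.T @ (w[:, None] * X) + 1e-6 * np.eye(2)
            b = X.T @ (w * anchor_depth)
            s, t = np.linalg.solve(A, b)
            aligned[i, j] = s * pred_depth[i, j] + t
    return aligned
```

Because the weights vary smoothly over the image, each pixel receives its own scale and shift dominated by nearby anchors, which is what keeps consecutive frames consistent when the anchors come from a shared sparse reconstruction.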