In this paper, we propose a dense monocular SLAM system, named DeepRelativeFusion, that is capable of recovering a globally consistent 3D structure. To this end, we use a visual SLAM algorithm to reliably recover the camera poses and semi-dense depth maps of the keyframes, and then combine the keyframe pose-graph with the densified keyframe depth maps to reconstruct the scene. To improve the semi-dense depth maps, we propose an adaptive filtering scheme: a structure-preserving weighted-average smoothing filter that takes into account the pixel intensity and depth of the neighbouring pixels, yielding a substantial gain in reconstruction accuracy after densification. To perform densification, we introduce two incremental improvements to the energy minimization framework proposed by DeepFusion: (1) an improved cost function, and (2) the use of single-image relative depth prediction. Moreover, we show that the relative depth maps can be corrected and are sufficiently accurate to be used as priors for the densification. To demonstrate the generalizability of relative depth prediction, we qualitatively illustrate the dense reconstruction on two outdoor sequences. Quantitatively, our system also outperforms state-of-the-art dense SLAM systems in dense reconstruction accuracy by a large margin.
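As a rough illustration of the adaptive filtering idea, the sketch below smooths a semi-dense depth map with a weighted average whose weights fall off with both the intensity difference and the depth difference to each neighbouring pixel. It is a minimal bilateral-style sketch under stated assumptions, not the filter used in DeepRelativeFusion: the function name `adaptive_depth_filter`, the Gaussian weight form, and the parameters `radius`, `sigma_i`, and `sigma_d` are all hypothetical choices made for illustration.

```python
import math


def adaptive_depth_filter(depth, intensity, radius=1, sigma_i=10.0, sigma_d=0.1):
    """Smooth a semi-dense depth map with a structure-preserving weighted average.

    depth:     2D list of floats; None marks pixels without a depth estimate.
    intensity: 2D list of floats with the same shape as depth.
    The Gaussian weights and all parameter names are illustrative assumptions.
    """
    h, w = len(depth), len(depth[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d0 = depth[y][x]
            if d0 is None:
                continue  # only pixels that already have depth are filtered
            num = 0.0
            den = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    d = depth[ny][nx]
                    if d is None:
                        continue  # neighbours without depth do not contribute
                    di = intensity[ny][nx] - intensity[y][x]
                    dd = d - d0
                    # Weight shrinks when the neighbour differs in intensity or depth,
                    # so averaging does not blur across object boundaries.
                    wgt = math.exp(-(di * di) / (2 * sigma_i ** 2)
                                   - (dd * dd) / (2 * sigma_d ** 2))
                    num += wgt * d
                    den += wgt
            # den >= 1 because the centre pixel always contributes weight 1.
            out[y][x] = num / den
    return out
```

Neighbours with a similar intensity and depth pull the estimate towards their average, while neighbours across an intensity or depth discontinuity receive a near-zero weight and leave the edge intact.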