In this paper, we propose a dense monocular SLAM system, named DeepRelativeFusion, that is capable of recovering a globally consistent 3D structure. To this end, we use a visual SLAM algorithm to reliably recover the camera poses and semi-dense depth maps of the keyframes, and then use relative depth prediction to densify the semi-dense depth maps and refine the keyframe pose-graph. To improve the semi-dense depth maps, we propose an adaptive filtering scheme: a structure-preserving weighted-average smoothing filter that takes into account the pixel intensity and depth of the neighbouring pixels, yielding a substantial gain in reconstruction accuracy after densification. To perform densification, we introduce two incremental improvements to the energy minimization framework proposed by DeepFusion: (1) an improved cost function, and (2) the use of single-image relative depth prediction. After densification, we update the keyframes with two-view-consistent optimized semi-dense and dense depth maps to improve pose-graph optimization, providing a feedback loop that refines the keyframe poses for accurate scene reconstruction. Quantitatively, our system outperforms state-of-the-art dense SLAM systems in dense reconstruction accuracy by a large margin.
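To make the adaptive filtering idea concrete, the following is a minimal sketch of a structure-preserving weighted-average filter on a semi-dense depth map. It is an illustration under stated assumptions, not the filter used in DeepRelativeFusion: the Gaussian weighting (in the style of a joint bilateral filter), the function name `adaptive_depth_filter`, the parameters `radius`, `sigma_i` and `sigma_d`, and the convention that a depth of 0 marks a pixel without an estimate are all hypothetical choices made here.

```python
import numpy as np


def adaptive_depth_filter(depth, intensity, radius=2, sigma_i=0.1, sigma_d=0.1):
    """Smooth a semi-dense depth map with intensity- and depth-aware weights.

    Assumptions (not taken from the paper): Gaussian weights, a square
    window of side 2 * radius + 1, and depth == 0 meaning "no estimate".
    """
    depth = np.asarray(depth, dtype=np.float64)
    intensity = np.asarray(intensity, dtype=np.float64)
    valid = depth > 0
    h, w = depth.shape

    # Pad so that every pixel has a full window; padded depths are invalid.
    pad_d = np.pad(depth, radius, mode="constant", constant_values=0.0)
    pad_i = np.pad(intensity, radius, mode="edge")
    pad_v = np.pad(valid, radius, mode="constant", constant_values=False)

    num = np.zeros_like(depth)  # weighted sum of neighbouring depths
    den = np.zeros_like(depth)  # sum of weights

    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ys = slice(radius + dy, radius + dy + h)
            xs = slice(radius + dx, radius + dx + w)
            nd, ni, nv = pad_d[ys, xs], pad_i[ys, xs], pad_v[ys, xs]

            # Neighbours with similar intensity and similar depth get a
            # large weight, so averaging does not cross structure edges.
            w_int = np.exp(-((ni - intensity) ** 2) / (2.0 * sigma_i ** 2))
            w_dep = np.exp(-((nd - depth) ** 2) / (2.0 * sigma_d ** 2))
            weight = w_int * w_dep * nv  # invalid neighbours contribute 0

            num += weight * nd
            den += weight

    # Only pixels that already had an estimate are updated; the filter
    # smooths the semi-dense map and does not densify it.
    out = np.zeros_like(depth)
    out[valid] = num[valid] / den[valid]
    return out
```

In this sketch, each valid pixel always contributes to its own average with weight 1, so the normalisation never divides by zero. Smaller values of `sigma_i` and `sigma_d` preserve edges more strictly at the cost of less smoothing.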