This paper presents a semantic planar SLAM system that improves pose estimation and mapping using cues from an instance planar segmentation network. While mainstream approaches rely on RGB-D sensors, using a monocular camera with such a system still poses challenges such as robust data association and precise geometric model fitting. In most existing work, geometric model estimation problems such as homography estimation and piece-wise planar reconstruction (PPR) are solved separately and sequentially with standard (greedy) RANSAC. However, setting the inlier-outlier threshold is difficult in the absence of information about the scene (i.e., its scale). In this work, we revisit these problems and argue that both geometric models (homographies and 3D planes) can be estimated by minimizing an energy function that exploits spatial coherence, i.e., with graph-cut optimization, which also addresses the practical issue of inaccurate output from a trained CNN. Moreover, we propose an adaptive parameter setting strategy based on our experiments and report a comprehensive evaluation on various open-source datasets.
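To make the contrast between greedy per-point assignment and an energy that rewards spatial coherence concrete, the following is a minimal sketch and not the paper's formulation. It assumes a Potts-style smoothness term (a fixed penalty whenever neighbouring points take different model labels), which the abstract does not specify. The names `labeling_energy`, `residuals`, `edges`, and `smoothness_weight` are hypothetical. The code only evaluates the energy of a given labeling; a graph-cut solver would be what searches for the minimizing labeling.

```python
from typing import List, Sequence, Tuple


def labeling_energy(
    residuals: Sequence[Sequence[float]],
    labels: Sequence[int],
    edges: Sequence[Tuple[int, int]],
    smoothness_weight: float,
) -> float:
    """Energy of assigning each point to one geometric model.

    residuals[p][l] is the fitting cost of point p under model l
    (e.g. a reprojection or point-to-plane error).
    edges lists pairs of neighbouring points.
    The smoothness term is an assumed Potts penalty.
    """
    # Data term: how well each point fits the model it is assigned to.
    data_term = sum(residuals[p][labels[p]] for p in range(len(labels)))
    # Smoothness term: count neighbouring pairs with different labels.
    smooth_term = sum(1 for p, q in edges if labels[p] != labels[q])
    return data_term + smoothness_weight * smooth_term


# Toy data: four points along a chain, two candidate models.
# Point 2 is ambiguous and fits model 1 slightly better.
residuals = [[0.1, 0.9], [0.2, 0.8], [0.6, 0.5], [0.1, 0.9]]
edges = [(0, 1), (1, 2), (2, 3)]

# Greedy choice: each point independently picks its lowest-residual model.
greedy: List[int] = [min(range(2), key=lambda l: r[l]) for r in residuals]

# Spatially coherent choice: all points share model 0.
coherent: List[int] = [0, 0, 0, 0]

e_greedy = labeling_energy(residuals, greedy, edges, smoothness_weight=0.5)
e_coherent = labeling_energy(residuals, coherent, edges, smoothness_weight=0.5)
print(greedy, e_greedy, e_coherent)
```

In this toy case the greedy labeling isolates the ambiguous point and pays two smoothness penalties, so the coherent labeling has lower total energy even though its data term is slightly higher. The value of `smoothness_weight` here is arbitrary and chosen only for illustration.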