Monocular simultaneous localization and mapping (SLAM) is emerging in advanced driver assistance systems and autonomous driving because a single camera is cheap and easy to install. Conventional monocular SLAM faces two major challenges that lead to inaccurate localization and mapping. First, it is difficult to estimate scale in localization and mapping. Second, conventional monocular SLAM incorporates inappropriate features into the map, such as those on dynamic objects and in low-parallax areas. This paper proposes an improved real-time monocular SLAM that resolves both challenges by efficiently using deep learning-based semantic segmentation. To achieve real-time execution, we apply semantic segmentation only to downsampled keyframes, in parallel with the mapping processes. In addition, the proposed method corrects the scale of camera poses and three-dimensional (3D) points using a ground plane estimated from road-labeled 3D points together with the real camera height. The proposed method also removes inappropriate corner features labeled as moving objects or lying in low-parallax areas. Experiments with six video sequences demonstrate that the proposed monocular SLAM system achieves significantly higher trajectory tracking accuracy than state-of-the-art monocular SLAM and accuracy comparable to state-of-the-art stereo SLAM.
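The scale correction described in the abstract can be illustrated with a minimal sketch. It assumes a plain least-squares plane fit over the road-labeled 3D points and a single global scale factor; the abstract does not specify the fitting procedure, so the function names (`fit_ground_plane`, `scale_from_camera_height`, `apply_scale`) and the choice of an SVD-based fit are illustrative assumptions rather than the authors' implementation. A practical system would likely need a robust estimator to cope with mislabeled points.

```python
import numpy as np


def fit_ground_plane(road_points):
    """Fit a plane n.x + d = 0 to road-labeled 3D points (least squares).

    Assumption: a plain SVD fit with no outlier rejection.
    Returns the unit normal n and the offset d.
    """
    pts = np.asarray(road_points, dtype=float)
    if pts.ndim != 2 or pts.shape[1] != 3 or pts.shape[0] < 3:
        raise ValueError("need at least three 3D points")
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    normal = vt[-1]
    d = -float(normal.dot(centroid))
    return normal, d


def scale_from_camera_height(road_points, camera_center, real_camera_height):
    """Ratio between the real camera height and the height measured in SLAM units."""
    normal, d = fit_ground_plane(road_points)
    # Point-to-plane distance; the normal has unit length.
    estimated_height = abs(float(normal.dot(np.asarray(camera_center, dtype=float))) + d)
    if estimated_height < 1e-9:
        raise ValueError("camera center lies on the estimated ground plane")
    return real_camera_height / estimated_height


def apply_scale(points, camera_centers, scale):
    """Rescale map points and camera positions by the same factor."""
    return np.asarray(points, dtype=float) * scale, np.asarray(camera_centers, dtype=float) * scale


# Synthetic example: road points on the plane y = 0.5 (SLAM units),
# camera at the origin, real camera height assumed to be 1.65 m.
xs, zs = np.meshgrid(np.linspace(-1.0, 1.0, 5), np.linspace(1.0, 5.0, 5))
road = np.stack([xs.ravel(), np.full(xs.size, 0.5), zs.ravel()], axis=1)
camera = np.zeros(3)
scale = scale_from_camera_height(road, camera, real_camera_height=1.65)
scaled_road, scaled_camera = apply_scale(road, camera[None, :], scale)
```

In this synthetic case the camera sits 0.5 SLAM units above the fitted plane, so the scale factor is the ratio of 1.65 to 0.5, and the rescaled road points lie 1.65 m below the camera.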