Recent work has shown impressive localization performance using only images of ground textures taken with a downward-facing monocular camera. This provides a reliable navigation method that is robust to feature-sparse environments and challenging lighting conditions. However, these localization methods require an existing map for comparison. Our work aims to relax the need for a map by introducing a full simultaneous localization and mapping (SLAM) system. Because no existing map is required, setup time is minimized and the system is more robust to changing environments. The system combines several techniques. Image keypoints are identified and projected onto the ground plane. These keypoints, visual bags of words, and several threshold parameters are then used to identify overlapping images and revisited areas. The system then uses robust M-estimators to estimate the transforms between robot poses whose images overlap or that revisit the same area. These optimized estimates make up the map used for navigation. We show, through experimental data, that this system performs reliably on many ground textures, but not all.
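To illustrate the role of the robust M-estimator, the following is a minimal sketch and not the implementation evaluated in this work. It assumes that keypoints have already been projected onto the ground plane and matched between two overlapping images, that the relative pose is a 2D rigid transform, and that a Huber loss is minimized by iteratively reweighted least squares. The function names, the choice of the Huber loss, and the `delta` and `iters` parameters are all hypothetical choices made for this example.

```python
import numpy as np


def fit_rigid_2d(src, dst, weights):
    """Weighted least-squares 2D rigid transform (R, t) mapping src onto dst."""
    w = weights / weights.sum()
    src_mean = (w[:, None] * src).sum(axis=0)
    dst_mean = (w[:, None] * dst).sum(axis=0)
    src_c = src - src_mean
    dst_c = dst - dst_mean
    # Weighted cross-covariance between the centered point sets.
    H = (w[:, None] * src_c).T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


def robust_rigid_2d(src, dst, delta=0.05, iters=20):
    """Huber M-estimate of a 2D rigid transform via reweighted least squares.

    delta is an assumed inlier scale, in the same units as the points.
    """
    weights = np.ones(len(src))
    for _ in range(iters):
        R, t = fit_rigid_2d(src, dst, weights)
        residuals = np.linalg.norm(dst - (src @ R.T + t), axis=1)
        # Huber weights: 1 inside delta, delta / r outside, so that
        # mismatched keypoints have bounded influence on the estimate.
        weights = np.where(
            residuals <= delta, 1.0, delta / np.maximum(residuals, 1e-12)
        )
    return R, t
```

Calling `robust_rigid_2d(src, dst)` with two `(N, 2)` arrays of matched ground-plane points returns a rotation matrix and a translation vector. Compared with an unweighted fit, mismatched keypoints pull the estimate far less, which is the property that motivates M-estimators in this setting.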