Localization is a challenging task in autonomous navigation. A loop-detection algorithm must overcome environmental changes to achieve place recognition and re-localization of robots. Deep learning has therefore been studied extensively for transforming measurements into consistent localization descriptors. Street-view images are easily accessible, but they are vulnerable to appearance changes. LiDAR, in contrast, robustly provides precise structural information; however, constructing a point-cloud database is expensive, and point clouds are available only in limited places. Unlike previous works that train networks to produce a shared embedding directly between 2D images and 3D point clouds, we transform both modalities into 2.5D depth images for matching. In this work, we propose a novel cross-matching method, called (LC)$^2$, that achieves LiDAR localization without a prior point-cloud map. To this end, LiDAR measurements are expressed as range images before matching, which reduces the modality discrepancy. A network is then trained to extract localization descriptors from disparity and range images, and the best matches are employed as loop-closure factors in a pose graph. Using public datasets that include multiple sessions under significantly different lighting conditions, we demonstrate that LiDAR-based navigation systems can be optimized from image databases and vice versa.
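The first step of the pipeline, projecting LiDAR measurements into range images so they share a 2.5D representation with camera-derived disparity images, follows the standard spherical projection. Below is a minimal NumPy sketch of that projection; the image resolution and vertical field-of-view parameters (`h`, `w`, `fov_up_deg`, `fov_down_deg`) are illustrative assumptions rather than values from this work.

```python
import numpy as np

def pointcloud_to_range_image(points, h=64, w=1024,
                              fov_up_deg=3.0, fov_down_deg=-25.0):
    """Spherically project an (N, 3) LiDAR point cloud into an (h, w) range image.

    h, w and the vertical field of view are hypothetical sensor parameters
    (roughly a 64-beam spinning LiDAR), not values from the paper.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)        # range per point
    valid = r > 1e-6                          # drop degenerate returns
    x, y, z, r = x[valid], y[valid], z[valid], r[valid]

    yaw = np.arctan2(y, x)                    # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)                  # elevation angle

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Map angles to pixel coordinates: azimuth -> column, elevation -> row.
    u = 0.5 * (1.0 - yaw / np.pi) * w
    v = (fov_up - pitch) / fov * h

    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    # Rasterize far points first so nearer returns overwrite them,
    # keeping the closest range per pixel.
    order = np.argsort(r)[::-1]
    image = np.zeros((h, w), dtype=np.float32)
    image[v[order], u[order]] = r[order]
    return image
```

Rasterizing in descending-range order is one simple way to resolve collisions when several points fall on the same pixel; the resulting single-channel range image can then be fed to the descriptor network alongside disparity images.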