Retrieval-based place recognition is an efficient and effective solution for re-localization within a pre-built map or for global data association in Simultaneous Localization and Mapping (SLAM). The accuracy of such an approach depends heavily on the quality of the extracted scene-level representation. End-to-end solutions, which learn a global descriptor from input point clouds, have demonstrated promising results, but they are limited in their ability to enforce desirable properties at the local feature level. In this paper, we demonstrate that an additional training signal (a local consistency loss) can guide the network to learn local features that are consistent across revisits, leading to more repeatable global descriptors and an overall improvement in place recognition performance. We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net. Experiments on two large-scale public benchmarks show that our method achieves mean $F1_{max}$ scores of $0.939$ on KITTI and $0.968$ on MulRan while operating in near real-time.
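For readers unfamiliar with the reported metric, the following is a minimal sketch of how a maximum F1 score is commonly computed for retrieval-based place recognition. It is not taken from the paper's evaluation code. The function name `f1_max` and its inputs are hypothetical, and the sketch assumes a simplified protocol: each query contributes one top-1 descriptor distance and a ground-truth flag saying whether that retrieval is a true revisit.

```python
def f1_max(distances, is_true_match):
    """Return the maximum F1 score over all distance thresholds.

    distances:      top-1 descriptor distance for each query (assumed input)
    is_true_match:  True if that retrieved candidate is a real revisit (assumed input)

    A query is predicted as a match when its distance is <= the threshold.
    """
    pairs = list(zip(distances, is_true_match))
    total_true = sum(1 for _, match in pairs if match)
    if total_true == 0:
        return 0.0  # no positives, so F1 is undefined; report 0 by convention

    best = 0.0
    # Every observed distance is a candidate threshold.
    for threshold in sorted({d for d, _ in pairs}):
        tp = sum(1 for d, match in pairs if d <= threshold and match)
        fp = sum(1 for d, match in pairs if d <= threshold and not match)
        if tp == 0:
            continue  # precision and recall are both zero here
        precision = tp / (tp + fp)
        recall = tp / total_true
        f1 = 2 * precision * recall / (precision + recall)
        best = max(best, f1)
    return best
```

Sweeping the threshold over the observed distances traces out the precision-recall curve, and the maximum F1 summarizes its best operating point in a single number. A "mean" score would then typically be an average of this value over several evaluation sequences, which is also an assumption here.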