Camera relocalization is a key component of simultaneous localization and mapping (SLAM) systems. This paper proposes a learning-based approach, named Sparse Spatial Scene Embedding with Graph Neural Networks (S3E-GNN), as an end-to-end framework for efficient and robust camera relocalization. S3E-GNN consists of two modules. In the encoding module, a trained S3E network encodes each RGB image into an embedding code that implicitly represents the spatial and semantic content of the scene. With these embedding codes and the associated poses obtained from a SLAM system, each image is represented as a node in a pose graph. In the GNN query module, the pose graph is transformed into an embedding-aggregated reference graph for camera relocalization. We collect various scene datasets in challenging environments to perform experiments. Our results demonstrate that the S3E-GNN method outperforms the traditional bag-of-words (BoW) approach for camera relocalization, owing to its learned embeddings and GNN-powered scene matching mechanism.
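To make the two-module pipeline concrete, the minimal PyTorch sketch below illustrates one plausible wiring under stated assumptions; it is not the authors' implementation. A toy CNN stands in for the S3E encoder that maps an RGB image to a fixed-length embedding code, and a single mean-aggregation message-passing layer stands in for the GNN that turns a SLAM pose graph into an embedding-aggregated reference graph. All module names, dimensions, the adjacency construction, and the cosine-similarity matching are assumptions for illustration.

```python
# Illustrative sketch only -- not the paper's S3E-GNN implementation.
# Assumes: 128-d embedding codes, a toy CNN encoder, and one round of
# mean-aggregation message passing over a SLAM pose graph given as an
# adjacency matrix.
import torch
import torch.nn as nn

class S3EEncoder(nn.Module):
    """Toy stand-in for the S3E network: RGB image -> embedding code."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        x = self.features(img).flatten(1)       # (N, 64)
        return self.proj(x)                     # (N, embed_dim)

class PoseGraphGNN(nn.Module):
    """One round of neighbor averaging over the pose graph, then a linear update."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.update = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, codes: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = adj @ codes / deg               # mean over pose-graph neighbors
        return self.update(torch.cat([codes, neigh], dim=1))

# Toy usage: 5 keyframes, a chain-shaped pose graph, one query image.
encoder, gnn = S3EEncoder(), PoseGraphGNN()
frames = torch.randn(5, 3, 64, 64)              # keyframe RGB images
adj = torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
ref = gnn(encoder(frames), adj)                 # embedding-aggregated reference graph
query = encoder(torch.randn(1, 3, 64, 64))      # query embedding code
scores = torch.cosine_similarity(query, ref)    # match query against graph nodes
best = scores.argmax().item()                   # relocalize to best-matching node
```

In this sketch, relocalization reduces to nearest-neighbor retrieval over the aggregated node embeddings; aggregating over graph neighbors lets each node's code reflect its spatial context, which is the intuition behind matching against the reference graph rather than individual frames.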