Place recognition plays an essential role in autonomous driving and robot navigation. Although a number of point-cloud-based methods have been proposed and have achieved promising results, few of them take the size differences between objects into consideration. For small objects such as pedestrians and vehicles, large receptive fields capture unrelated information, while for large objects such as buildings, small receptive fields fail to encode complete geometric information. We argue that fixed receptive fields are not well suited to place recognition, and propose a novel Adaptive Receptive Field Module (ARFM), which adaptively adjusts the size of the receptive field based on the input point cloud. We also present a novel network architecture, named TransLoc3D, to obtain discriminative global descriptors of point clouds for the place recognition task. TransLoc3D consists of a 3D sparse convolutional module, an ARFM module, an external transformer network that aims to capture long-range dependencies, and a NetVLAD layer. Experiments show that our method outperforms prior state-of-the-art results, improving average recall@1 by 1.1% on the Oxford RobotCar dataset and by 0.8% on the B.D. dataset.
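The idea of an input-dependent receptive field can be illustrated with a small NumPy sketch. It is not the ARFM described in the paper, whose internals the abstract does not specify. It assumes a selective-kernel-style design: features are pooled over several neighbourhood radii, and a softmax gate computed from a global summary of the cloud decides how much each radius contributes. The function names, the brute-force radius search, and the gating matrix `w_gate` are all hypothetical.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def radius_mean_features(points, feats, radius):
    """Average each point's features over all points within `radius`.

    Brute-force O(N^2) neighbour search; every point is its own neighbour,
    so the denominator is never zero.
    """
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)  # (N, N)
    mask = (d2 <= radius ** 2).astype(feats.dtype)
    return mask @ feats / mask.sum(axis=1, keepdims=True)


def adaptive_fusion(points, feats, radii, w_gate):
    """Fuse multi-radius features with weights that depend on the input.

    points: (N, 3) coordinates, feats: (N, C) per-point features,
    radii:  B candidate receptive-field sizes,
    w_gate: (C, B) hypothetical gating matrix (random here, learned in practice).
    """
    # One branch per radius: (B, N, C)
    branches = np.stack([radius_mean_features(points, feats, r) for r in radii])
    # Global summary of the cloud, used to choose between branches: (C,)
    summary = branches.sum(axis=0).mean(axis=0)
    # One weight per branch, summing to 1: (B,)
    weights = softmax(summary @ w_gate, axis=0)
    # Weighted sum over branches: (N, C)
    fused = np.tensordot(weights, branches, axes=1)
    return fused, weights


rng = np.random.default_rng(0)
points = rng.uniform(0.0, 10.0, size=(64, 3))
feats = rng.normal(size=(64, 8))
w_gate = rng.normal(size=(8, 3))
fused, weights = adaptive_fusion(points, feats, radii=(0.5, 2.0, 8.0), w_gate=w_gate)
```

Here the gate is computed once per cloud, so the whole scene shares a single mix of radii. A per-point gate would instead let pedestrians and buildings in the same scan favour different radii, at the cost of a larger gating computation.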