Visual place recognition (VPR) is a key problem for robotics and autonomous systems. To balance runtime against accuracy, most methods use a coarse-to-fine hierarchical architecture: global features retrieve the top-N candidates, and local features then re-rank them. However, because the two types of features are usually processed independently, re-ranking may degrade the global retrieval result, a failure we term re-ranking confusion. Moreover, re-ranking is bounded by global retrieval, since it cannot recover a correct match that is missing from the top-N candidates. In this paper, we propose a tightly coupled learning (TCL) strategy for training triplet models. Unlike the original triplet learning (OTL) strategy, TCL combines global and local descriptors for joint optimization. We also propose a bidirectional search dynamic time warping (BS-DTW) algorithm, tailored to VPR, that mines local spatial information during re-ranking. Experimental results on public benchmarks show that models trained with TCL outperform those trained with OTL, and that TCL can serve as a general strategy for improving performance on weakly supervised ranking tasks. Furthermore, our lightweight unified model outperforms several state-of-the-art methods and is more than an order of magnitude more computationally efficient, which meets the real-time requirements of robots.
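The coarse-to-fine pipeline described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it substitutes classic dynamic time warping for BS-DTW, whose details the abstract does not give, and the names `dtw_distance`, `retrieve_and_rerank`, and the `"global"`/`"local"` dictionary keys are assumptions chosen for illustration. Descriptors are toy lists of floats rather than learned features.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of local descriptors.

    Stand-in for BS-DTW, which is not specified in the abstract.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = l2(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def retrieve_and_rerank(query, database, top_n):
    """Return database indices: coarse top-N by global distance, then DTW re-rank."""
    by_global = sorted(
        range(len(database)),
        key=lambda k: l2(query["global"], database[k]["global"]),
    )
    candidates = by_global[:top_n]
    # Fine stage: only the top-N candidates are ever re-ordered.
    return sorted(
        candidates,
        key=lambda k: dtw_distance(query["local"], database[k]["local"]),
    )


# Toy data (hypothetical): one global vector and a short local sequence per image.
query = {"global": [0.0, 0.0], "local": [[0.0], [1.0], [2.0]]}
database = [
    {"global": [0.1, 0.0], "local": [[5.0], [5.0], [5.0]]},  # close globally, poor locally
    {"global": [0.2, 0.0], "local": [[0.0], [1.0], [2.0]]},  # slightly farther, exact local match
    {"global": [5.0, 5.0], "local": [[0.0], [1.0], [2.0]]},  # exact local match, far globally
]

ranking = retrieve_and_rerank(query, database, top_n=2)
print(ranking)  # [1, 0]
```

In this toy case the fine stage promotes entry 1 over entry 0, while entry 2 never reaches re-ranking because the coarse stage excludes it. This mirrors the abstract's remark that re-ranking is limited by global retrieval.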