Recognising previously visited locations is an important, but unsolved, task in autonomous navigation. Current visual place recognition (VPR) benchmarks typically challenge models to recover the position of a query image (or images) from sequential datasets that include both spatial and temporal components. Recently, Echo State Network (ESN) varieties have proven particularly powerful at solving machine learning tasks that require spatio-temporal modelling. These networks are simple yet powerful neural architectures that, by exhibiting memory over multiple time-scales and non-linear high-dimensional representations, can discover temporal relations in the data while still maintaining linearity in the learning time. In this paper, we present a series of ESNs and analyse their applicability to the VPR problem. We report that adding ESNs to pre-processed convolutional neural networks led to a dramatic boost in performance compared with non-recurrent networks in five out of six standard benchmarks (GardensPoint, SPEDTest, ESSEX3IN1, Oxford RobotCar, and Nordland), demonstrating that ESNs are able to capture the temporal structure inherent in VPR problems. Moreover, we show that models that include ESNs can outperform class-leading VPR models that also exploit the sequential dynamics of the data. Finally, our results demonstrate that ESNs improve generalisation abilities, robustness, and accuracy, further supporting their suitability for VPR applications.
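To make the ESN idea concrete, the following is a minimal sketch of a generic leaky-integrator ESN with a ridge-regression readout, written with NumPy. It is not the architecture evaluated in the paper: the function names, reservoir size, leak rate, spectral radius, and ridge strength are all illustrative assumptions. The sketch shows the two properties mentioned above: a fixed random recurrent reservoir that carries memory of earlier inputs, and a readout that is the only trained part and is obtained by solving a linear system.

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(X, W_in, W, leak=0.3):
    """Feed a sequence X of shape (T, n_in) through the reservoir.

    The leak rate (an assumed value) sets how quickly the state forgets
    earlier inputs, i.e. the time-scale of the reservoir's memory.
    """
    x = np.zeros(W.shape[0])
    states = np.empty((X.shape[0], W.shape[0]))
    for t, u in enumerate(X):
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        states[t] = x
    return states

def fit_readout(states, Y, ridge=1e-3):
    """Train only the linear readout via ridge regression."""
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ Y)  # shape (n_res, n_out)
```

In a VPR setting, `X` would hold one feature vector per frame (for example, a CNN descriptor) and `Y` one-hot place labels; a place prediction for each frame is then `np.argmax(states @ W_out, axis=1)`.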