Sequential matching using hand-crafted heuristics has been the standard practice in route-based place recognition for enhancing pairwise similarity results for nearly a decade. However, the precision-recall performance of these algorithms degrades dramatically when searching over short temporal window (TW) lengths, and they demand high compute and storage costs on the large robotic datasets used in autonomous navigation research. Here, inspired by biological systems that navigate robustly across spatiotemporal scales even without vision, we develop a joint visual and positional representation learning technique, via a sequential process, and design a learning-based CNN+LSTM architecture, trainable via backpropagation through time, for viewpoint- and appearance-invariant place recognition. Our approach, Sequential Place Learning (SPL), is based on a CNN function that visually encodes an environment from a single traversal, thus reducing storage requirements, while an LSTM temporally fuses each visual embedding with corresponding positional data -- obtained from any source of motion estimation -- for direct sequential inference. In contrast to classical two-stage pipelines, e.g., match-then-temporally-filter, our network directly suppresses false positives while jointly learning sequence matching from a single monocular image sequence, even with short TWs. We demonstrate that our model outperforms 15 classical methods and sets new state-of-the-art performance on 4 challenging benchmark datasets, one of which can be considered solved, with recall rates of 100% at 100% precision, correctly matching all places under extreme sunlight-darkness changes. In addition, we show that SPL can be up to 70x faster to deploy than classical methods on a 729 km route comprising 35,768 consecutive frames. Extensive experiments demonstrate the... Baseline code available at https://github.com/mchancan/deepseqslam
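The CNN+LSTM design described above can be sketched as follows. This is a minimal illustrative implementation in PyTorch, not the authors' exact architecture (see the linked repository for that): all layer sizes, the concatenation-based fusion of visual embeddings with positional data, and the per-step place classifier are assumptions made for clarity.

```python
import torch
import torch.nn as nn

class SPLNetSketch(nn.Module):
    """Illustrative CNN+LSTM place-recognition model in the spirit of SPL.

    A small CNN encodes each frame into a visual embedding; an LSTM fuses
    each embedding with positional data (e.g., odometry) at every time step,
    so the network can be trained end-to-end via backpropagation through
    time and run direct sequential inference even over short TWs.
    """

    def __init__(self, embed_dim=256, pos_dim=2, hidden_dim=512, num_places=100):
        super().__init__()
        # Lightweight CNN encoder applied independently to each frame.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
            nn.Flatten(),
            nn.Linear(64 * 16, embed_dim),
        )
        # LSTM temporally fuses visual embedding + positional data per step.
        self.lstm = nn.LSTM(embed_dim + pos_dim, hidden_dim, batch_first=True)
        # Per-step classifier over reference places from the single traversal.
        self.head = nn.Linear(hidden_dim, num_places)

    def forward(self, images, positions):
        # images: (B, T, 3, H, W); positions: (B, T, pos_dim)
        b, t = images.shape[:2]
        feats = self.cnn(images.flatten(0, 1)).view(b, t, -1)
        fused, _ = self.lstm(torch.cat([feats, positions], dim=-1))
        return self.head(fused)  # (B, T, num_places) place logits per step

model = SPLNetSketch(num_places=100)
imgs = torch.randn(2, 5, 3, 64, 64)  # short TW of 5 frames per sequence
pos = torch.randn(2, 5, 2)           # hypothetical (x, y) odometry per frame
logits = model(imgs, pos)
print(logits.shape)  # torch.Size([2, 5, 100])
```

Emitting place logits at every time step is what makes the model trainable with a standard cross-entropy loss via backpropagation through time, with one ground-truth place label per frame.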