This paper introduces a novel unsupervised neural network model for visual information encoding that addresses the problem of large-scale visual localization. Inspired by the structure of the visual cortex, the model, named HSD, alternates layers of topological sparse coding and pooling to build an increasingly compact code of visual information. The model is intended for visual place recognition (VPR) systems that use local descriptors, and we evaluate the impact of integrating it into LPMP, a bio-inspired model for self-localization. Our experimental results on the KITTI dataset show that HSD improves the runtime of LPMP by a factor of at least 2 and its localization accuracy by 10%. A comparison with CoHog, a state-of-the-art VPR approach, shows that our method achieves slightly better results.
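To make the alternation of sparse coding and pooling concrete, the following is a minimal toy sketch, not the paper's HSD: it uses random unit-norm dictionaries instead of learned ones, a plain ISTA solver for the sparse codes, and simple max pooling, and it omits the topological organization of the coding layers. All names (`sparse_code`, `max_pool`, the layer sizes) are illustrative assumptions.

```python
import numpy as np

def sparse_code(x, D, lam=0.1, n_iter=50):
    """Encode x over dictionary D with ISTA, minimizing
    0.5 * ||x - D a||^2 + lam * ||a||_1 (illustrative solver, not HSD's)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the smooth term
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = a - (D.T @ (D @ a - x)) / L    # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft threshold
    return a

def max_pool(a, k=2):
    """Non-overlapping max pooling over groups of k coefficients."""
    return a[: len(a) // k * k].reshape(-1, k).max(axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal(64)                # toy input "patch" descriptor
layers = [(64, 128), (64, 96)]             # (input dim, dictionary atoms) per layer
for d_in, n_atoms in layers:
    D = rng.standard_normal((d_in, n_atoms))
    D /= np.linalg.norm(D, axis=0)         # unit-norm dictionary atoms
    x = max_pool(sparse_code(x, D))        # sparse coding, then pooling
print(x.shape)                             # compact code after two stages: (48,)
```

Each stage maps its input to a sparse code over a dictionary and then pools it, so the representation shrinks from 64 to 48 dimensions here; the actual HSD learns its dictionaries without supervision and arranges them topologically, which this sketch does not attempt.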