In a Simultaneous Localization and Mapping (SLAM) system, loop closure can eliminate accumulated errors. It is accomplished by Visual Place Recognition (VPR), a task that retrieves the current scene from a set of pre-stored sequential images by matching scene descriptors. In urban scenes, appearance variation caused by seasons and illumination poses great challenges to the robustness of scene descriptors. Semantic segmentation images convey not only the shapes of objects but also their categories and spatial relations, which are not affected by appearance variation of the scene. Inspired by the Vector of Locally Aggregated Descriptors (VLAD), in this paper we propose a novel image descriptor with aggregated semantic skeleton representation (SSR), dubbed SSR-VLAD, for VPR under drastic appearance variation of environments. The SSR-VLAD of an image aggregates the semantic skeleton features of each category and encodes the spatial-temporal distribution of the image's semantic information. We conduct a series of experiments on three public datasets of challenging urban scenes. Compared with four state-of-the-art VPR methods (CoHOG, NetVLAD, LOST-X, and Region-VLAD), VPR by matching SSR-VLAD outperforms those methods while maintaining competitive real-time performance.
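To make the aggregation idea concrete, the following is a minimal sketch of standard VLAD residual aggregation applied separately to each semantic class. It is not the SSR-VLAD implementation: skeleton extraction and the spatial-temporal encoding are omitted. The function names, the use of plain feature vectors as stand-ins for skeleton features, and the pre-computed cluster centers are all assumptions made for illustration.

```python
import math

def nearest(vec, centers):
    """Index of the center closest to vec (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, centers[k])))

def class_wise_vlad(features, labels, centers_by_class):
    """Concatenate one VLAD block per semantic class, then L2-normalize.

    features: list of D-dimensional local features (hypothetical stand-ins
              for semantic skeleton features)
    labels: semantic class of each feature
    centers_by_class: {class: list of D-dimensional cluster centers},
                      assumed to be learned offline (e.g. by k-means)
    """
    descriptor = []
    for cls in sorted(centers_by_class):  # fixed class order keeps blocks aligned
        centers = centers_by_class[cls]
        dim = len(centers[0])
        block = [[0.0] * dim for _ in centers]
        for vec, lab in zip(features, labels):
            if lab != cls:
                continue
            k = nearest(vec, centers)
            for d in range(dim):
                # Accumulate the residual to the assigned center.
                block[k][d] += vec[d] - centers[k][d]
        descriptor.extend(x for row in block for x in row)
    norm = math.sqrt(sum(x * x for x in descriptor))
    return [x / norm for x in descriptor] if norm > 0 else descriptor

def similarity(a, b):
    """Dot product; equals cosine similarity for L2-normalized descriptors."""
    return sum(x * y for x, y in zip(a, b))

# Toy example: two classes, 2-D features.
centers = {"building": [[0.0, 0.0], [10.0, 10.0]], "road": [[5.0, 0.0]]}
features = [[1.0, 0.0], [9.0, 10.0], [5.0, 2.0]]
labels = ["building", "building", "road"]
desc = class_wise_vlad(features, labels, centers)
```

In a retrieval setting, a query descriptor would be compared against the descriptors of the stored images with `similarity`, and the highest-scoring image taken as the loop-closure candidate. Keeping one block per class means that features from different semantic categories never share a cluster center.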