Most research on image-based place recognition targets urban environments. In bucolic environments, i.e. natural scenes with low texture and little semantic content, the main challenge is handling variations in visual appearance over time, such as changes in illumination, weather, vegetation state, or viewpoint. Because the nature of these variations differs from the urban case, describing a bucolic scene calls for a different approach. We introduce a global image descriptor computed from the image's semantic and topological information. It is built from the wavelet transforms of the image's semantic edges, so matching two images amounts to matching their semantic edge descriptors. We show that this method reaches state-of-the-art image retrieval performance on two multi-season environment-monitoring datasets, CMU-Seasons and Symphony Lake. It also generalises to urban scenes, where it is on par with the current baselines NetVLAD and DELF.
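To make the idea of a wavelet-based edge descriptor concrete, here is a minimal toy sketch, not the paper's implementation. It assumes a semantic label map given as a 2D list of integer class ids, reduces the edge of one class to a 1D boundary profile, applies a Haar wavelet transform, and compares truncated coefficient vectors with a Euclidean distance. The function names (`boundary_profile`, `haar_transform`, `descriptor`, `distance`), the choice of the Haar wavelet, the 1D profile, and the `keep` parameter are all illustrative assumptions.

```python
import math


def haar_transform(signal):
    """Full orthonormal Haar decomposition of a power-of-two-length sequence.

    Returns [coarsest approximation, details ordered from coarse to fine].
    """
    n = len(signal)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("signal length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = detail + details  # coarser levels end up first
    return approx + details


def boundary_profile(label_map, label):
    """Per column, the row of the topmost pixel of `label` (image height if absent)."""
    height = len(label_map)
    width = len(label_map[0])
    profile = []
    for col in range(width):
        row = next((r for r in range(height) if label_map[r][col] == label), height)
        profile.append(float(row))
    return profile


def descriptor(label_map, label, keep=4):
    """Toy edge descriptor: the first `keep` Haar coefficients of the boundary profile."""
    return haar_transform(boundary_profile(label_map, label))[:keep]


def distance(desc_a, desc_b):
    """Euclidean distance between two descriptors of equal length."""
    return math.dist(desc_a, desc_b)


# Hypothetical 4x4 label maps: 0 = sky, 1 = vegetation.
map_a = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
map_b = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

same = distance(descriptor(map_a, 1), descriptor(map_a, 1))
different = distance(descriptor(map_a, 1), descriptor(map_b, 1))
```

In this sketch, `same` is zero and `different` is positive, so a database image would be retrieved by picking the smallest descriptor distance to the query. Truncating to the first `keep` coefficients retains the coarse shape of the edge and discards fine detail, which is one plausible way to tolerate small appearance changes.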