Image keypoint extraction is an important step for visual localization. Localization in indoor environments is challenging because many unreliable features may lie on dynamic or repetitive objects. Such reliability cannot be well learned by existing Convolutional Neural Network (CNN) based feature extractors. We propose a novel network, RaP-Net, which explicitly addresses feature invariability with a region-wise predictor and combines it with a point-wise predictor to select reliable keypoints in an image. We also build a new dataset, OpenLORIS-Location, to train this network. The dataset contains 1553 indoor images with location labels. There are various scene changes between images of the same location, which can help the network learn invariability in typical indoor scenes. Experimental results show that the proposed RaP-Net trained with the OpenLORIS-Location dataset significantly outperforms existing CNN-based keypoint extraction algorithms for indoor localization. The code and data are available at https://github.com/ivipsourcecode/RaP-Net.
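As a rough illustration of the combination described above (not the authors' implementation), the sketch below shows one way a coarse region-wise reliability map could re-weight a dense point-wise score map before keypoint selection. The function name, tensor shapes, and the bilinear upsampling / top-k selection steps are all assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def select_reliable_keypoints(point_scores, region_reliability, top_k=500):
    """Weight a point-wise score map by a region-wise reliability map
    and keep the top-k weighted responses as keypoints.

    point_scores:       (H, W) tensor, per-pixel keypoint response.
    region_reliability: (h, w) tensor, coarse per-region reliability in [0, 1].
    """
    H, W = point_scores.shape
    # Upsample the coarse region-wise map to full image resolution.
    region_up = F.interpolate(
        region_reliability[None, None], size=(H, W),
        mode="bilinear", align_corners=False,
    )[0, 0]
    # Suppress responses falling on unreliable (e.g. dynamic or repetitive) regions.
    weighted = point_scores * region_up
    # Select the top-k pixels of the weighted map as keypoints.
    flat = weighted.flatten()
    k = min(top_k, flat.numel())
    scores, idx = torch.topk(flat, k)
    ys = torch.div(idx, W, rounding_mode="floor")
    xs = idx % W
    return torch.stack([xs, ys], dim=1), scores


if __name__ == "__main__":
    # Random maps standing in for the network's point-wise and region-wise outputs.
    pts, sc = select_reliable_keypoints(torch.rand(480, 640), torch.rand(30, 40))
    print(pts.shape, sc.shape)
```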