Visual relocalization aims to estimate the pose of a camera from one or more images. In recent years, deep learning-based pose regression methods have attracted considerable attention. They predict absolute poses directly, without relying on any pre-built maps or stored images, which makes relocalization very efficient. However, robust relocalization in environments with complex appearance changes and real-world dynamics remains very challenging. In this paper, we propose to enhance the distinctiveness of image features by extracting the deep relationships among objects. In particular, we extract the objects in an image and construct a deep object relation graph (ORG) that incorporates the semantic connections and relative spatial cues of the objects. We integrate our ORG module into several popular pose regression models. Extensive experiments on various public indoor and outdoor datasets demonstrate that our method improves performance significantly and outperforms previous approaches.