Given a pair of partially overlapping source and target images and a keypoint in the source image, the keypoint's correspondent in the target image can be visible, occluded, or outside the field of view. Local feature matching methods can only identify the correspondent's location when it is visible, whereas humans can also hallucinate its location through geometric reasoning when it is occluded or outside the field of view. In this paper, we bridge this gap by training a network to output a peaked probability distribution over the correspondent's location, regardless of whether the correspondent is visible, occluded, or outside the field of view. We experimentally demonstrate that this network is indeed able to hallucinate correspondences on pairs of images captured in scenes that were not seen at training time. We also apply this network to an absolute camera pose estimation problem and find that it is significantly more robust than state-of-the-art competitors based on local feature matching.
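To make the idea of a distribution that also covers locations outside the field of view concrete, the following is a minimal sketch and not the method of the paper. It assumes, hypothetically, that the network emits one score per cell of a target grid padded by `margin` pixels on every side; the function name `locate_correspondent`, the padding scheme, and the array shapes are all illustrative assumptions.

```python
import numpy as np

def locate_correspondent(logits, image_hw, margin):
    """Return the most likely correspondent location from per-cell scores.

    Assumption (not from the paper): `logits` has shape
    (H + 2*margin, W + 2*margin), i.e. the target image grid padded by
    `margin` cells on every side so that out-of-view locations have a cell.
    """
    h, w = image_hw
    assert logits.shape == (h + 2 * margin, w + 2 * margin)

    # Numerically stable softmax over all cells of the padded grid.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Mode of the distribution, in padded-grid coordinates.
    row, col = np.unravel_index(np.argmax(probs), probs.shape)

    # Shift to target-image coordinates; negative or too-large values
    # mean the predicted location lies outside the field of view.
    y, x = int(row) - margin, int(col) - margin
    inside = (0 <= y < h) and (0 <= x < w)
    return (x, y), inside, float(probs.max())


# Toy usage: an 8x10 target image padded by 2 cells, with a single peak
# placed one row above the top image border.
scores = np.zeros((12, 14))
scores[1, 5] = 10.0
location, inside, peak = locate_correspondent(scores, (8, 10), 2)
```

In this sketch the location alone only separates in-view from out-of-view predictions; telling a visible correspondent from an occluded one would need additional information that is not modeled here.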