The key challenge in learning dense correspondences lies in the lack of ground-truth matches for real image pairs. While photometric consistency losses provide unsupervised alternatives, they struggle with large appearance changes, which are ubiquitous in geometric and semantic matching tasks. Moreover, methods relying on synthetic training pairs often suffer from poor generalisation to real data. We propose Warp Consistency, an unsupervised learning objective for dense correspondence regression. Our objective is effective even in settings with large appearance and viewpoint changes. Given a pair of real images, we first construct an image triplet by applying a randomly sampled warp to one of the original images. We derive and analyze all flow-consistency constraints arising within the triplet. From our observations and empirical results, we design a general unsupervised objective employing two of the derived constraints. We validate our warp consistency loss by training three recent dense correspondence networks for the geometric and semantic matching tasks. Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, RobotCar and TSS. Code and models are available at github.com/PruneTruong/DenseMatching.
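To make the triplet construction concrete, below is a minimal PyTorch-style sketch of two consistency terms in the spirit of the abstract: a warp-supervision term (the prediction between an image and its synthetically warped copy should reproduce the known warp) and a composed "bipath" term (the direct prediction to the warped image should match the chained prediction through the unwarped one). The helper names (`warp`, `compose`, `sample_warp`, `net`), the flow conventions, the choice of which image is warped, and the loss weighting are assumptions for illustration only, not the authors' released implementation (see the repository above for that).

```python
# Illustrative sketch of a warp-consistency objective; names and conventions
# are assumptions, not the official DenseMatching code.
import torch
import torch.nn.functional as F


def warp(img, flow):
    """Sample `img` (B,C,H,W) at locations displaced by `flow` (B,2,H,W).

    Convention assumed here: flow channel 0 is the x-displacement, channel 1
    the y-displacement, so warp(B, F_{A->B}) aligns image B to image A.
    """
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(img.device)      # (2,H,W)
    coords = grid.unsqueeze(0) + flow                                # (B,2,H,W)
    # Normalise absolute coordinates to [-1, 1] for grid_sample.
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid_n = torch.stack((coords_x, coords_y), dim=-1)               # (B,H,W,2)
    return F.grid_sample(img, grid_n, align_corners=True)


def compose(flow_ab, flow_bc):
    """Chain flows A->B and B->C into a flow A->C."""
    return flow_ab + warp(flow_bc, flow_ab)


def warp_consistency_losses(net, img_src, img_tgt, sample_warp, lam=1.0):
    """Unsupervised losses for a real pair, per the sketch's conventions.

    `net(a, b)` is assumed to predict the dense flow from `a` to `b`, and
    `sample_warp` to return a random synthetic flow of the same resolution.
    """
    w_rand = sample_warp(img_tgt.shape)              # known synthetic warp W
    img_tgt_w = warp(img_tgt, w_rand)                # third image of the triplet

    # Warp-supervision: prediction between the warped copy and the original
    # target should equal the known warp W.
    loss_warp_sup = (net(img_tgt_w, img_tgt) - w_rand).abs().mean()

    # Bipath constraint: the direct prediction from the warped target to the
    # source should match the known warp W chained with the predicted
    # target-to-source flow.
    flow_direct = net(img_tgt_w, img_src)
    flow_chained = compose(w_rand, net(img_tgt, img_src))
    loss_bipath = (flow_direct - flow_chained).abs().mean()

    return loss_bipath + lam * loss_warp_sup
```

In this sketch only the synthetic warp `w_rand` is ever used as a regression target or as the known first leg of the composition, so no ground-truth flow between the two real images is required; the relative weight `lam` between the two terms is a hypothetical hyperparameter.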