Establishing dense correspondences between a pair of images is an important and general problem, covering geometric matching, optical flow, and semantic correspondences. While these applications share fundamental challenges, such as large displacements, pixel accuracy, and appearance changes, they are currently addressed with specialized network architectures, each designed for only one particular task. This severely limits the generalization capabilities of such networks to new scenarios where, for example, robustness to larger displacements or higher accuracy is required. In this work, we propose a universal network architecture that is directly applicable to all the aforementioned dense correspondence problems. We achieve both high accuracy and robustness to large displacements by investigating the combined use of global and local correlation layers. We further propose an adaptive resolution strategy, allowing our network to operate on virtually any input image resolution. The proposed GLU-Net achieves state-of-the-art performance for geometric and semantic matching as well as optical flow, using the same network and weights for all three tasks. Code and trained models are available at https://github.com/PruneTruong/GLU-Net.
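To make the distinction between global and local correlation layers concrete, the following is a minimal NumPy sketch, not the GLU-Net implementation. The function names `global_correlation` and `local_correlation`, the `(C, H, W)` feature layout, the zero padding at the borders, and the `radius` parameter are all assumptions made for illustration. A global correlation compares each target location with every source location, which can capture arbitrarily large displacements at a cost that grows with the square of the feature-map size. A local correlation compares each target location only with source locations inside a small window, which is cheap enough for fine resolutions but cannot see displacements larger than the window.

```python
import numpy as np


def global_correlation(target, source):
    """All-pairs correlation between two feature maps.

    target: (C, Ht, Wt), source: (C, Hs, Ws)
    Returns (Hs * Ws, Ht, Wt), where out[k, i, j] is the dot product
    between target[:, i, j] and the k-th (row-major) source location.
    """
    c = source.shape[0]
    source_flat = source.reshape(c, -1)  # (C, Hs * Ws)
    return np.einsum("ck,cij->kij", source_flat, target)


def local_correlation(target, source, radius):
    """Correlation restricted to a (2 * radius + 1)^2 search window.

    target, source: (C, H, W) with identical shapes (assumed here).
    Returns ((2 * radius + 1) ** 2, H, W), where each channel holds the
    dot product between target[:, i, j] and source[:, i + dy, j + dx].
    Source locations outside the image are treated as zeros.
    """
    _, h, w = target.shape
    padded = np.pad(source, ((0, 0), (radius, radius), (radius, radius)))
    size = 2 * radius + 1
    out = np.empty((size * size, h, w), dtype=target.dtype)
    k = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # shifted[:, i, j] == source[:, i + dy, j + dx] (zero if outside)
            shifted = padded[:, radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            out[k] = (target * shifted).sum(axis=0)
            k += 1
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_t = rng.standard_normal((4, 8, 8))
    feat_s = rng.standard_normal((4, 8, 8))
    print(global_correlation(feat_t, feat_s).shape)    # (64, 8, 8)
    print(local_correlation(feat_t, feat_s, 2).shape)  # (25, 8, 8)
```

In this sketch the global output has one channel per source location (64 for an 8×8 map), while the local output has one channel per displacement in the window (25 for `radius=2`), regardless of image size. That fixed channel count is what makes a local layer affordable at high resolution, and the all-pairs output is what lets a global layer handle large displacements at coarse resolution.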