Existing pipelines for semantic correspondence commonly extract high-level semantic features to gain invariance to intra-class variations and background clutter. This architecture, however, inevitably yields a low-resolution matching field, which then requires an ad-hoc interpolation step as post-processing to convert it into a high-resolution one, limiting the overall matching performance. To overcome this, inspired by the recent success of implicit neural representations, we present a novel method for semantic correspondence, called Neural Matching Field (NeMF). The complexity and high dimensionality of a 4D matching field are the major hindrances; to address them, we propose a cost embedding network that processes a coarse cost volume, which then serves as guidance for establishing a high-precision matching field through a subsequent fully-connected network. Nevertheless, learning a high-dimensional matching field remains challenging, mainly because of its computational cost: naive exhaustive inference would require querying all pixels in the 4D space to infer pixel-wise correspondences. We therefore propose dedicated training and inference procedures: in the training phase, we randomly sample matching candidates, and in the inference phase, we iteratively perform PatchMatch-based inference and coordinate optimization at test time. With these components combined, NeMF attains competitive results on several standard benchmarks for semantic correspondence. Code and pre-trained weights are available at https://ku-cvlab.github.io/NeMF/.
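To make the inference idea concrete, the following is a minimal sketch of a PatchMatch-style search over a continuous matching field. It is not the authors' implementation: `toy_score` is a hypothetical stand-in for the learned field (it simply peaks at a fixed translation), the function names, radius schedule, and iteration count are all assumptions, and the coordinate optimization step mentioned above is omitted. The point is only that each source pixel evaluates a handful of candidates per iteration instead of every target location.

```python
import math
import random


def toy_score(src, tgt, shift=(3.0, -2.0)):
    """Hypothetical stand-in for a learned matching field.

    Returns a higher score the closer `tgt` is to `src + shift`.
    """
    dx = tgt[0] - (src[0] + shift[0])
    dy = tgt[1] - (src[1] + shift[1])
    return -(dx * dx + dy * dy)


def patchmatch_inference(height, width, score, iters=6, radius=8.0, seed=0):
    """Estimate one continuous target coordinate per source pixel."""
    rng = random.Random(seed)
    # Random initialisation of the correspondence field.
    field = {
        (x, y): (rng.uniform(0, width), rng.uniform(0, height))
        for y in range(height)
        for x in range(width)
    }
    for it in range(iters):
        # Alternate the scan direction so good matches spread both ways.
        step = 1 if it % 2 == 0 else -1
        ys = range(height) if step == 1 else range(height - 1, -1, -1)
        xs = range(width) if step == 1 else range(width - 1, -1, -1)
        for y in ys:
            for x in xs:
                src = (x, y)
                best = field[src]
                best_score = score(src, best)
                # Propagation: reuse the offsets of already-visited neighbours.
                for nx, ny in ((x - step, y), (x, y - step)):
                    if (nx, ny) in field:
                        tx, ty = field[(nx, ny)]
                        cand = (tx + (x - nx), ty + (y - ny))
                        cand_score = score(src, cand)
                        if cand_score > best_score:
                            best, best_score = cand, cand_score
                # Random search: perturb the best match with a shrinking radius.
                r = radius
                while r >= 0.25:
                    cand = (best[0] + rng.uniform(-r, r),
                            best[1] + rng.uniform(-r, r))
                    cand_score = score(src, cand)
                    if cand_score > best_score:
                        best, best_score = cand, cand_score
                    r /= 2.0
                field[src] = best
    return field


def mean_error(field, shift=(3.0, -2.0)):
    """Average distance between each estimate and the toy ground truth."""
    total = 0.0
    for (x, y), (tx, ty) in field.items():
        total += math.hypot(tx - (x + shift[0]), ty - (y + shift[1]))
    return total / len(field)
```

Calling `patchmatch_inference(8, 8, toy_score)` returns a dictionary mapping each source pixel to an estimated target coordinate, and `mean_error` compares that estimate with the toy ground truth. In this sketch each pixel issues at most nine score queries per iteration, whereas an exhaustive search would query every target location for every source pixel.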