Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm for enhancing a low-resolution (LR) input image or video by introducing an additional high-resolution (HR) reference image. Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images and compensate for the information loss in input images. However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR). To tackle these challenges, we propose C2-Matching, which performs explicit, robust matching across transformation and resolution. 1) To bridge the transformation gap, we propose a contrastive correspondence network, which learns transformation-robust correspondences using augmented views of the input image. 2) To address the resolution gap, we adopt teacher-student correlation distillation, which distills knowledge from the easier HR-HR matching to guide the more ambiguous LR-HR matching. 3) Finally, we design a dynamic aggregation module to address the potential misalignment between input images and reference images. In addition, to faithfully evaluate Reference-based Image Super-Resolution under a realistic setting, we contribute the Webly-Referenced SR (WR-SR) dataset, which mimics the practical usage scenario. We also extend C2-Matching to the Reference-based Video Super-Resolution task, where an image taken in a similar scene serves as the HR reference image. Extensive experiments demonstrate that C2-Matching significantly outperforms state-of-the-art methods on the standard CUFED5 benchmark and also boosts the performance of video SR when the C2-Matching component is incorporated into video SR pipelines.
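To make the teacher-student correlation distillation idea more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the temperature value, and the choice of a KL-divergence loss over softmax-normalized correlation rows are all assumptions made for illustration. The teacher correlation is computed from HR input descriptors against HR reference descriptors, the student correlation from LR input descriptors against the same reference, and the loss pulls the student's matching distribution toward the teacher's.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale each descriptor to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def correlation(feat_a, feat_b):
    # Cosine similarity between every descriptor in feat_a (N, C) and feat_b (M, C).
    return l2_normalize(feat_a) @ l2_normalize(feat_b).T

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(teacher_corr, student_corr, temperature=0.15):
    # Assumed loss form: mean KL(teacher || student) over input positions.
    # The temperature is a hypothetical value chosen for this sketch.
    p = softmax(teacher_corr / temperature)
    q = softmax(student_corr / temperature)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(np.mean(kl))

def explicit_matches(corr):
    # Explicit matching: for each input position, pick the most similar reference position.
    return corr.argmax(axis=1)

# Toy data: 6 positions with 8-dimensional descriptors (all values are synthetic).
rng = np.random.default_rng(0)
hr_input = rng.normal(size=(6, 8))                       # teacher sees HR input features
reference = hr_input.copy()                              # reference shares the same content
lr_input = hr_input + 0.5 * rng.normal(size=(6, 8))      # student sees degraded features

teacher_corr = correlation(hr_input, reference)          # easier HR-HR matching
student_corr = correlation(lr_input, reference)          # more ambiguous LR-HR matching
loss = distillation_loss(teacher_corr, student_corr)
matches = explicit_matches(teacher_corr)
```

In this toy setup the teacher matches every position to itself, and the loss is zero only when the student reproduces the teacher's correlation distribution exactly. In a real pipeline the descriptors would come from learned feature extractors rather than random arrays.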