Deep learning-based blind super-resolution (SR) methods have recently achieved unprecedented performance in upscaling frames with unknown degradation. These models can accurately estimate the unknown downscaling kernel from a given low-resolution (LR) image and leverage it during restoration. Although these approaches have largely been successful, they are predominantly image-based and therefore do not exploit the temporal properties of the kernels across multiple video frames. In this paper, we investigate the temporal properties of the kernels and highlight their importance in the task of blind video super-resolution. Specifically, we measure the kernel temporal consistency of real-world videos and illustrate how the estimated kernels can change from frame to frame in videos with varying degrees of scene and object motion. With this new insight, we revisit popular video SR approaches and show that the common assumption of a fixed kernel throughout the restoration process can lead to visual artifacts when upscaling real-world videos. To counteract this, we tailor existing single-image and video SR techniques to leverage kernel consistency during both kernel estimation and video upscaling. Extensive experiments on synthetic and real-world videos show substantial restoration gains, both quantitative and qualitative, achieving a new state of the art in blind video SR and underlining the potential of exploiting kernel temporal consistency.
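As a minimal sketch of what measuring kernel temporal consistency could look like (the paper's exact metric and kernel estimator are not specified here; the L2-based drift measure, function names, and synthetic data below are assumptions for illustration, with per-frame kernels taken as given from any blind estimator):

```python
import numpy as np

def kernel_temporal_consistency(kernels):
    """Measure how much per-frame estimated kernels drift over time.

    `kernels` has shape (T, k, k): one estimated downscaling kernel
    per frame. Returns the mean L2 distance between kernels of
    consecutive frames (0 = perfectly consistent across the video).
    This metric is an assumption for illustration, not the paper's.
    """
    kernels = np.asarray(kernels, dtype=np.float64)
    # Flatten and normalize each kernel so the distance reflects
    # kernel shape rather than overall scale.
    flat = kernels.reshape(len(kernels), -1)
    flat /= np.linalg.norm(flat, axis=1, keepdims=True)
    # L2 distance between each pair of consecutive frames' kernels.
    diffs = np.linalg.norm(flat[1:] - flat[:-1], axis=1)
    return diffs.mean()

# Hypothetical usage: a mostly static scene vs. a highly dynamic one.
rng = np.random.default_rng(0)
base = rng.random((15, 15))
static_kernels = [base + 0.01 * rng.random((15, 15)) for _ in range(30)]
dynamic_kernels = [rng.random((15, 15)) for _ in range(30)]
print(kernel_temporal_consistency(static_kernels))   # small drift
print(kernel_temporal_consistency(dynamic_kernels))  # large drift
```

Under this reading, a low score supports reusing one kernel across frames, while a high score signals that a fixed-kernel assumption may introduce artifacts.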