Video super-resolution (VSR) aims to reconstruct high-resolution (HR) frames from a low-resolution (LR) reference frame and multiple neighboring frames. The key operation is to exploit the relatively misaligned neighboring frames for reconstructing the current frame while preserving the consistency of the results. Existing methods generally explore information propagation and frame alignment to improve VSR performance; however, few studies focus on the temporal consistency between frames. In this paper, we propose a Temporal Consistency learning Network (TCNet) for VSR, trained in an end-to-end manner, to enhance the consistency of the reconstructed videos. A spatio-temporal stability module is designed to learn self-alignment across frames. In particular, correlative matching is employed to exploit the spatial dependency within each frame and maintain structural stability. Moreover, a self-attention mechanism is utilized to learn the temporal correspondence that drives an adaptive warping operation, enforcing temporal consistency among multiple frames. In addition, a hybrid recurrent architecture is designed to leverage both short-term and long-term information. We further present a progressive fusion module that performs a multistage fusion of spatio-temporal features; the final reconstructed frames are refined by these fused features. Objective and subjective results of various experiments demonstrate that TCNet achieves superior performance on different benchmark datasets compared to several state-of-the-art methods.
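To make the attention-based alignment concrete, below is a minimal PyTorch sketch (not the authors' released code) of how a self-attention mechanism can learn temporal correspondence and perform adaptive warping: query features from the reference frame attend to key/value features from a neighboring frame, and the neighbor's features are aggregated by the resulting soft correspondences. The module name `AttentionWarp`, the channel width, and all shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionWarp(nn.Module):
    """Adaptive warping via attention over spatial positions (illustrative sketch)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        # 1x1 convolutions project features into query/key/value spaces.
        self.to_q = nn.Conv2d(channels, channels, 1)
        self.to_k = nn.Conv2d(channels, channels, 1)
        self.to_v = nn.Conv2d(channels, channels, 1)

    def forward(self, ref_feat: torch.Tensor, nbr_feat: torch.Tensor) -> torch.Tensor:
        # ref_feat, nbr_feat: (B, C, H, W) features of the reference frame
        # and one neighboring frame.
        b, c, h, w = ref_feat.shape
        q = self.to_q(ref_feat).flatten(2).transpose(1, 2)   # (B, HW, C)
        k = self.to_k(nbr_feat).flatten(2)                   # (B, C, HW)
        v = self.to_v(nbr_feat).flatten(2).transpose(1, 2)   # (B, HW, C)
        # Correlation between every reference position and every neighbor
        # position, normalized into soft temporal correspondences.
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)       # (B, HW, HW)
        # "Warp" the neighbor features toward the reference frame by
        # attention-weighted aggregation.
        aligned = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return aligned

if __name__ == "__main__":
    warp = AttentionWarp(channels=64)
    ref = torch.randn(1, 64, 32, 32)
    nbr = torch.randn(1, 64, 32, 32)
    print(warp(ref, nbr).shape)  # torch.Size([1, 64, 32, 32])
```

Because the attention matrix is HW x HW, a practical implementation would restrict matching to local windows or downsampled features; the global form above is kept only for clarity.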