As a widely studied task, video restoration aims to enhance the quality of videos that suffer from multiple potential degradations, such as noise, blur, and compression artifacts. Within video restoration, compressed video quality enhancement and video super-resolution are two of the main tasks, with significant value in practical scenarios. Recently, recurrent neural networks and transformers have attracted increasing research interest in this field because of their impressive capability in sequence-to-sequence modeling. However, training these models is not only costly but also relatively hard to converge, owing to exploding and vanishing gradients. To cope with these problems, we propose a two-stage framework consisting of a multi-frame recurrent network and a single-frame transformer. In addition, we develop multiple training strategies, such as transfer learning and progressive training, to shorten the training time and improve model performance. Benefiting from these technical contributions, our solution won two championships and one runner-up place in the NTIRE 2022 challenges on super-resolution and quality enhancement of compressed video. Code is available at https://github.com/ryanxingql/winner-ntire22-vqe.