As a widely studied task, video restoration aims to enhance the quality of videos affected by multiple potential degradations, such as noise, blur, and compression artifacts. Among video restoration tasks, compressed video quality enhancement and video super-resolution are two of the main ones, with significant value in practical scenarios. Recently, recurrent neural networks and transformers have attracted increasing research interest in this field, owing to their impressive capability in sequence-to-sequence modeling. However, training these models is not only costly but also relatively hard to converge, because of exploding and vanishing gradients. To cope with these problems, we propose a two-stage framework that includes a multi-frame recurrent network and a single-frame transformer. In addition, multiple training strategies, such as transfer learning and progressive training, are developed to shorten the training time and improve model performance. Benefiting from these technical contributions, our solution won two first-place finishes and one runner-up in the NTIRE 2022 challenges on super-resolution and quality enhancement of compressed video.