Super-resolution (SR) has been widely used to convert low-resolution legacy videos to high-resolution (HR) ones to suit the increasing resolution of displays (e.g., UHD TVs). However, motion artifacts (e.g., motion judder) become easier for viewers to notice when HR videos are rendered on larger display devices. Broadcasting standards therefore support higher frame rates for UHD (Ultra High Definition) videos (4K@60 fps, 8K@120 fps), meaning that SR alone is insufficient to produce genuinely high-quality videos. Hence, up-converting legacy videos for realistic applications requires not only SR but also video frame interpolation (VFI). In this paper, we first propose a joint VFI-SR framework for up-scaling the spatio-temporal resolution of videos from 2K 30 fps to 4K 60 fps. To this end, we propose a novel training scheme with a multi-scale temporal loss that imposes temporal regularization on the input video sequence; this scheme can be applied to any general video-related task. The proposed structure is analyzed in depth with extensive experiments.