This paper reviews the video extreme super-resolution challenge associated with the AIM 2020 workshop at ECCV 2020. Common scaling factors for learned video super-resolution (VSR) do not go beyond 4. In that regime, missing information can be restored well, especially in HR videos, where the high-frequency content mostly consists of texture details. The task in this challenge is to upscale videos by an extreme factor of 16, which results in far more severe degradations that also affect the structural integrity of the videos. A single pixel in the low-resolution (LR) domain corresponds to 256 pixels in the high-resolution (HR) domain. Due to this massive information loss, it is hard to accurately restore the missing content. Track 1 is set up to gauge the state of the art for such a demanding task, with fidelity to the ground truth measured by PSNR and SSIM. Perceptually higher quality can be achieved at the cost of fidelity by generating plausible high-frequency content. Track 2 therefore aims at generating visually pleasing results, which are ranked according to human perception as evaluated by a user study. In contrast to single image super-resolution (SISR), VSR can benefit from additional information in the temporal domain. However, this also imposes an additional requirement, as the generated frames need to be consistent over time.
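To make the Track 1 evaluation and the ×16 information loss concrete, the sketch below shows how PSNR between a ground-truth HR frame and a super-resolved frame is typically computed, and how one LR pixel maps to 256 HR pixels. This is a minimal illustration, not the official challenge evaluation script; the frame sizes are hypothetical, and in practice SSIM would be computed alongside PSNR (e.g. with an off-the-shelf implementation).

```python
import numpy as np

def psnr(hr, sr, max_val=255.0):
    """Peak signal-to-noise ratio between a ground-truth HR frame and a super-resolved frame."""
    mse = np.mean((hr.astype(np.float64) - sr.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Illustration of the x16 information loss: one LR pixel covers a 16x16 HR patch.
scale = 16
lr_h, lr_w = 120, 68                      # hypothetical LR frame size
hr_h, hr_w = lr_h * scale, lr_w * scale   # corresponding HR frame size
print((hr_h * hr_w) // (lr_h * lr_w))     # -> 256 HR pixels per LR pixel
```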