Super-Resolution (SR) is a critical task in computer vision that aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs. The field has seen significant progress through various challenges, particularly in single-image SR. Video Super-Resolution (VSR) extends this task to the temporal domain, aiming to enhance video quality using methods based on local, uni-directional, or bi-directional propagation, or on traditional upscaling followed by restoration. This challenge addresses VSR for video conferencing, where LR videos are encoded with H.265 at fixed QPs. The goal is to upscale the videos by a specific factor, providing HR outputs with enhanced perceptual quality under a low-delay scenario using causal models. The challenge included three tracks: general-purpose videos, talking-head videos, and screen content videos, with separate datasets provided by the organizers for training, validation, and testing. As part of this challenge, we open-sourced a new screen content dataset for the SR task. Submissions were evaluated through subjective tests using a crowdsourced implementation of ITU-T Rec. P.910.
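To make the degradation setup concrete, the sketch below shows one way to produce an H.265-encoded LR input at a fixed QP using ffmpeg invoked from Python. This is not the organizers' exact pipeline: the downscaling factor (2x), the QP value (37), and the file names are illustrative placeholders, since the challenge description only states that a fixed QP and a specific upscale factor are used.

```python
# Minimal sketch, assuming ffmpeg with libx265 is available on the system.
# Scale factor and QP below are placeholders, not the official challenge settings.
import subprocess

def encode_lr(src: str, dst: str, scale: int = 2, qp: int = 37) -> None:
    """Downscale `src` by `scale` and encode it with H.265 at a fixed QP."""
    subprocess.run(
        [
            "ffmpeg", "-y", "-i", src,
            "-vf", f"scale=iw/{scale}:ih/{scale}",  # spatial downscaling
            "-c:v", "libx265",                      # H.265/HEVC encoder
            "-x265-params", f"qp={qp}",             # fixed quantization parameter
            "-an",                                  # drop audio; only video is evaluated
            dst,
        ],
        check=True,
    )

# Hypothetical usage: create a compressed LR clip from an HR source.
encode_lr("hr_input.mp4", "lr_qp37.mp4")
```

A VSR method submitted to the low-delay tracks would then reconstruct the HR video from such compressed LR inputs using only past and current frames (causal processing).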