Super-Resolution (SR) is a critical task in computer vision that focuses on reconstructing high-resolution (HR) images from low-resolution (LR) inputs. The field has seen significant progress through various challenges, particularly in single-image SR. Video Super-Resolution (VSR) extends this to the temporal domain, aiming to enhance video quality using methods such as local, uni-directional, or bi-directional propagation, or traditional upscaling followed by restoration. This challenge addresses VSR for conferencing, where LR videos are encoded with H.265 at fixed quantization parameters (QPs). The goal is to upscale videos by a specific factor, producing HR outputs with enhanced perceptual quality under a low-delay scenario using causal models. The challenge included three tracks: general-purpose videos, talking-head videos, and screen-content videos, with separate datasets provided by the organizers for training, validation, and testing. We open-sourced a new screen content dataset for the SR task in this challenge. Submissions were evaluated through subjective tests using a crowdsourced implementation of ITU-T Rec. P.910.