Blind Face Restoration (BFR) aims to reconstruct a high-quality (HQ) face image from its corresponding low-quality (LQ) input. Recently, many BFR methods have been proposed and have achieved remarkable success. However, these methods are trained and evaluated on privately synthesized datasets, making it infeasible for subsequent approaches to compare with them fairly. To address this problem, we first synthesize two blind face restoration benchmark datasets, called EDFace-Celeb-1M (BFR128) and EDFace-Celeb-150K (BFR512). State-of-the-art methods are benchmarked on them under five settings: blur, noise, low resolution, JPEG compression artifacts, and their combination (full degradation). To make the comparison more comprehensive, five widely used quantitative metrics and two task-driven metrics, Average Face Landmark Distance (AFLD) and Average Face ID Cosine Similarity (AFICS), are applied. Furthermore, we develop an effective baseline model called Swin Transformer U-Net (STUNet). STUNet adopts a U-Net architecture with an attention mechanism and a shifted windowing scheme to capture long-range pixel interactions and focus on significant features, while still being efficient to train. Experimental results show that the proposed baseline method performs favourably against the SOTA methods on various BFR tasks.
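The shifted windowing scheme mentioned above can be illustrated with a minimal NumPy sketch: the feature map is split into non-overlapping windows (attention is computed only within each window, keeping cost linear in image size), and a cyclic shift between successive layers lets information flow across window boundaries. The function names (`window_partition`, `cyclic_shift`) and shapes below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping windows.

    Returns (num_windows, window_size, window_size, C); self-attention
    would then be computed independently inside each window. Assumes
    H and W are divisible by window_size.
    """
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size,
                  W // window_size, window_size, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

def cyclic_shift(x, shift):
    """Roll the feature map so that the next layer's windows straddle the
    previous window boundaries, enabling cross-window interaction."""
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))

# Example: an 8x8 single-channel map partitioned into 4x4 windows,
# then shifted by half the window size for the next attention layer.
x = np.arange(64, dtype=np.float32).reshape(8, 8, 1)
windows = window_partition(x, 4)                 # 4 windows of shape (4, 4, 1)
shifted_windows = window_partition(cyclic_shift(x, 2), 4)
print(windows.shape, shifted_windows.shape)
```

Alternating plain and shifted windows is what allows a window-based model to capture long-range pixel interactions without the quadratic cost of global attention.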