While deep learning-based methods for blind face restoration have achieved unprecedented success, they still suffer from two major limitations. First, most of them deteriorate when facing complex degradations outside their training data. Second, these methods require multiple constraints, e.g., fidelity, perceptual, and adversarial losses, which demand laborious hyper-parameter tuning to stabilize and balance their influences. In this work, we propose a novel method named DifFace that is capable of coping with unseen and complex degradations more gracefully, without complicated loss designs. The key to our method is to establish a posterior distribution from the observed low-quality (LQ) image to its high-quality (HQ) counterpart. In particular, we design a transition distribution from the LQ image to an intermediate state of a pre-trained diffusion model, and then gradually transition from this intermediate state to the HQ target by recursively applying the pre-trained diffusion model. The transition distribution only relies on a restoration backbone trained with an $L_2$ loss on synthetic data, which favorably avoids the cumbersome training process of existing methods. Moreover, the transition distribution can contract the error of the restoration backbone, which makes our method more robust to unknown degradations. Comprehensive experiments show that DifFace is superior to current state-of-the-art methods, especially in cases with severe degradations. Code and model are available at https://github.com/zsyOAOA/DifFace.
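The sampling pipeline described above (coarse restoration, diffusion to an intermediate state, then recursive reverse denoising) can be sketched as follows. This is a minimal illustrative toy, not the official implementation: `restorer` stands in for the $L_2$-trained restoration backbone, `eps_model` for the pre-trained diffusion denoiser, and the step count `N` is a hypothetical choice of the intermediate start step.

```python
import numpy as np

T = 1000                       # total steps of the (assumed) pre-trained diffusion model
N = 400                        # intermediate start step, N < T (hypothetical value)
betas = np.linspace(1e-4, 0.02, T)   # a standard linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

rng = np.random.default_rng(0)

def restorer(y):
    """Dummy stand-in for the restoration backbone f(y): a coarse HQ estimate."""
    return np.clip(y, 0.0, 1.0)

def eps_model(x_t, t):
    """Dummy stand-in for the pre-trained diffusion denoiser (predicts noise)."""
    return np.zeros_like(x_t)

def diffuse_to_N(x0_hat):
    """Transition distribution: diffuse the coarse estimate to step N."""
    eps = rng.standard_normal(x0_hat.shape)
    return np.sqrt(alpha_bars[N - 1]) * x0_hat + np.sqrt(1.0 - alpha_bars[N - 1]) * eps

def reverse_from_N(x):
    """Recursively apply the reverse diffusion steps N-1, ..., 0 (DDPM update)."""
    for t in range(N - 1, -1, -1):
        eps_hat = eps_model(x, t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

y = rng.random((16, 16, 3))    # toy low-quality input
x0_hat = restorer(y)           # 1. coarse restoration
x_N = diffuse_to_N(x0_hat)     # 2. jump to the intermediate diffusion state
x_0 = reverse_from_N(x_N)      # 3. denoise back to an HQ sample
print(x_0.shape)
```

Note how the backbone is only used once, to produce the starting point `x_N`; the injected noise and the subsequent reverse steps are what allow the diffusion prior to contract the backbone's restoration error.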