While deep learning-based methods for blind face restoration have achieved unprecedented success, they still suffer from two major limitations. First, most of them deteriorate when facing complex degradations outside their training data. Second, these methods require multiple constraints, e.g., fidelity, perceptual, and adversarial losses, which demand laborious hyper-parameter tuning to stabilize and balance their influences. In this work, we propose a novel method named DifFace that copes with unseen and complex degradations more gracefully without complicated loss designs. The key to our method is to establish a posterior distribution from the observed low-quality (LQ) image to its high-quality (HQ) counterpart. In particular, we design a transition distribution from the LQ image to an intermediate state of a pre-trained diffusion model, and then gradually transition from this intermediate state to the HQ target by recursively applying the pre-trained diffusion model. The transition distribution relies only on a restoration backbone trained with an $L_2$ loss on synthetic data, which favorably avoids the cumbersome training process of existing methods. Moreover, the transition distribution contracts the error of the restoration backbone, making our method more robust to unknown degradations. Comprehensive experiments show that DifFace is superior to current state-of-the-art methods, especially in cases with severe degradations. Our code and model are available at https://github.com/zsyOAOA/DifFace.
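The inference pipeline described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (function names, the choice of timestep `N`, and the dummy interfaces are assumptions, not the authors' actual API): a restoration backbone first produces a coarse HQ estimate, which is diffused forward to an intermediate timestep `N < T` of a standard DDPM schedule, and the pre-trained diffusion model's reverse process is then applied recursively from `N` down to `0`.

```python
import numpy as np

# Standard DDPM noise schedule (linear betas), as in common diffusion setups.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def difface_sample(y_lq, restorer, eps_model, N=400, rng=None):
    """Hypothetical DifFace-style sampler.

    restorer:  L2-trained backbone mapping an LQ image to a coarse HQ estimate.
    eps_model: pre-trained diffusion model predicting the noise at step t.
    N:         intermediate timestep where the reverse process starts (N < T).
    """
    rng = rng or np.random.default_rng(0)
    x0_hat = restorer(y_lq)  # coarse HQ estimate from the restoration backbone
    # Transition distribution: diffuse x0_hat directly to the intermediate
    # state x_N, i.e. sample x_N ~ N(sqrt(abar_N) x0_hat, (1 - abar_N) I).
    eps = rng.standard_normal(x0_hat.shape)
    x = np.sqrt(alpha_bars[N - 1]) * x0_hat + np.sqrt(1 - alpha_bars[N - 1]) * eps
    # Recursively apply the pre-trained reverse diffusion step from N down to 1.
    for t in range(N, 0, -1):
        eps_pred = eps_model(x, t)
        coef = betas[t - 1] / np.sqrt(1 - alpha_bars[t - 1])
        x = (x - coef * eps_pred) / np.sqrt(alphas[t - 1])
        if t > 1:  # no noise is added at the final step
            x += np.sqrt(betas[t - 1]) * rng.standard_normal(x.shape)
    return x
```

Because sampling starts at `N` rather than `T`, the diffusion prior only needs to remove the (contracted) residual error of the backbone, which is what makes the procedure tolerant of unseen degradations.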