We present SR3, an approach to image Super-Resolution via Repeated Refinement. SR3 adapts denoising diffusion probabilistic models to conditional image generation and performs super-resolution through a stochastic denoising process. Inference starts with pure Gaussian noise and iteratively refines the noisy output using a U-Net model trained to denoise at various noise levels. SR3 exhibits strong performance on super-resolution tasks at different magnification factors, on both faces and natural images. We conduct human evaluation on a standard 8× face super-resolution task on CelebA-HQ, comparing with state-of-the-art GAN methods. SR3 achieves a fool rate close to 50%, suggesting photo-realistic outputs, while GANs do not exceed a fool rate of 34%. We further show the effectiveness of SR3 in cascaded image generation, where generative models are chained with super-resolution models, yielding a competitive FID score of 11.3 on ImageNet.
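The inference procedure described above (start from pure Gaussian noise, then repeatedly denoise while conditioning on the low-resolution input) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a generic DDPM-style ancestral sampling loop on a 1-D list of pixel values, the function names `make_schedule`, `toy_denoiser`, and `refine` are hypothetical, the linear noise schedule and step count are assumed values, and `toy_denoiser` is an oracle stand-in for the trained U-Net that simply treats the upsampled low-resolution signal as the clean target.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed values, not taken from the abstract)."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a                      # cumulative product of alphas
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def toy_denoiser(low_res_up, y_t, alpha_bar):
    """Hypothetical stand-in for the trained U-Net.

    Predicts the noise in y_t by assuming the clean image equals the
    upsampled low-resolution input. A real model would learn this mapping.
    """
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [(y - scale * x) / sigma for y, x in zip(y_t, low_res_up)]


def refine(low_res_up, num_steps=1000, seed=0, denoiser=toy_denoiser):
    """Iteratively refine pure Gaussian noise, conditioned on low_res_up."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Inference starts from pure Gaussian noise.
    y = [rng.gauss(0.0, 1.0) for _ in low_res_up]

    for t in reversed(range(num_steps)):
        eps = denoiser(low_res_up, y, alpha_bars[t])
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        # Posterior mean: remove the predicted noise, then rescale.
        mean = [(yi - coef * ei) / math.sqrt(alphas[t])
                for yi, ei in zip(y, eps)]
        if t > 0:
            # Stochastic step: add fresh noise on every step but the last.
            sigma = math.sqrt(betas[t])
            y = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            y = mean
    return y


if __name__ == "__main__":
    upsampled = [0.1, -0.4, 0.7, 0.2]  # toy "upsampled low-res" signal
    print(refine(upsampled, num_steps=100))
```

Because the stand-in denoiser is an oracle, the loop converges to the conditioning signal itself. In an actual system a trained conditional network would be passed as `denoiser`, and the output would contain high-frequency detail that the low-resolution input lacks.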