In recent years, denoising diffusion models have demonstrated outstanding image generation performance. The information on natural images captured by these models is useful for many image reconstruction applications, where the task is to restore a clean image from its degraded observations. In this work, we propose a conditional sampling scheme that exploits the prior learned by diffusion models while retaining agreement with the observations. We then combine it with a novel approach for adapting pretrained diffusion denoising networks to their input. We examine two adaptation strategies: the first uses only the degraded image, while the second, which we advocate, is performed using images that are ``nearest neighbors'' of the degraded image, retrieved from a diverse dataset using an off-the-shelf vision-language model. To evaluate our method, we test it on two state-of-the-art publicly available diffusion models, Stable Diffusion and Guided Diffusion. We show that our proposed ``adaptive diffusion for image reconstruction'' (ADIR) approach achieves significant improvements on super-resolution, deblurring, and text-based editing tasks.