Image deblurring is an ill-posed problem: a given blurry input admits multiple plausible clean images. However, most existing methods produce a single deterministic estimate of the clean image and are trained to minimize pixel-level distortion. Pixel-level distortion metrics are known to correlate poorly with human perception, and optimizing them often leads to unrealistic reconstructions. We present an alternative framework for blind deblurring based on conditional diffusion models. Unlike existing techniques, we train a stochastic sampler that refines the output of a deterministic predictor and is capable of producing a diverse set of plausible reconstructions for a given input. This leads to a significant improvement in perceptual quality over existing state-of-the-art methods across multiple standard benchmarks. Our predict-and-refine approach also enables much more efficient sampling than typical diffusion models. Combined with a carefully tuned network architecture and inference procedure, our method is competitive in terms of distortion metrics such as PSNR. These results show clear benefits of our diffusion-based method for deblurring and challenge the widely used strategy of producing a single, deterministic reconstruction.
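The predict-and-refine idea can be illustrated with a toy sketch. It is not the paper's model: the signal is one-dimensional, `predict` is a hand-written unsharp mask standing in for a trained deterministic network, and `sample_residual` is a hypothetical stand-in for a conditional diffusion sampler. The sketch also assumes that the refinement takes the form of an additive residual on top of the predictor's output, which the abstract does not state. All function names and constants are illustrative.

```python
import random


def predict(blurry):
    """Deterministic stage: a toy unsharp mask standing in for a trained network."""
    n = len(blurry)
    sharpened = []
    for i in range(n):
        left = blurry[max(i - 1, 0)]
        right = blurry[min(i + 1, n - 1)]
        local_mean = (left + blurry[i] + right) / 3.0
        sharpened.append(blurry[i] + 0.5 * (blurry[i] - local_mean))
    return sharpened


def sample_residual(blurry, initial, steps, rng):
    """Stochastic stage: toy iterative sampler for a residual correction.

    A real conditional diffusion model would call a learned denoiser that
    conditions on `blurry` and `initial`. Here those inputs only fix the
    length: each step shrinks the residual and re-injects a decreasing
    amount of noise (an assumption made purely for illustration).
    """
    residual = [rng.gauss(0.0, 1.0) for _ in initial]
    for t in range(steps, 0, -1):
        noise_scale = 0.05 * (t - 1) / steps  # reaches zero on the final step
        residual = [0.5 * r + noise_scale * rng.gauss(0.0, 1.0) for r in residual]
    return residual


def deblur_samples(blurry, num_samples, steps=8, seed=0):
    """Return several candidate reconstructions for one blurry input."""
    initial = predict(blurry)  # computed once, shared by every sample
    samples = []
    for k in range(num_samples):
        rng = random.Random(seed + k)  # independent random stream per sample
        residual = sample_residual(blurry, initial, steps, rng)
        samples.append([x + r for x, r in zip(initial, residual)])
    return samples


blurry = [0.0, 0.2, 0.8, 1.0, 0.8, 0.2, 0.0]
candidates = deblur_samples(blurry, num_samples=3)
```

Each entry of `candidates` shares the same deterministic estimate and differs only in its sampled correction, which mirrors the contrast the abstract draws between a single deterministic reconstruction and a diverse set of plausible ones. The predictor runs once per input while only the sampler iterates. Whether that split is the source of the efficiency gain the abstract reports is not something this sketch can establish.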