Diffusion models have shown promising results on single-image super-resolution and other image-to-image translation tasks. Despite this success, they have not outperformed state-of-the-art GAN models on the more challenging blind super-resolution task, where the input images are out of distribution, with unknown degradations. This paper introduces SR3+, a diffusion-based model for blind super-resolution, establishing a new state of the art. To this end, we advocate self-supervised training with a combination of composite, parameterized degradations, together with noise-conditioning augmentation during training and testing. With these innovations, a large-scale convolutional architecture, and large-scale datasets, SR3+ greatly outperforms SR3. It also outperforms Real-ESRGAN when trained on the same data, with a DRealSR FID score of 36.82 vs. 37.22, which further improves to an FID of 32.37 with larger models, and further still with larger training sets.
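The noise-conditioning augmentation mentioned above can be illustrated with a minimal sketch: the low-resolution conditioning image is corrupted with Gaussian noise at a randomly sampled level, and that level is passed to the model as an extra conditioning signal. The function name, the uniform sampling of the level, and the `max_level` cap are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def noise_condition(lr_image: np.ndarray,
                    rng: np.random.Generator,
                    max_level: float = 0.5):
    """Sketch of noise-conditioning augmentation (hypothetical
    parameterization): corrupt the low-res conditioning image with
    Gaussian noise of a random level t, and return (noisy_image, t)
    so the denoiser can be conditioned on t at train and test time."""
    t = rng.uniform(0.0, max_level)          # sampled noise level
    noisy = lr_image + t * rng.standard_normal(lr_image.shape)
    return noisy.astype(np.float32), t

# Usage: augment one conditioning image before feeding the model.
rng = np.random.default_rng(0)
lr = np.zeros((64, 64, 3), dtype=np.float32)  # placeholder LR input
noisy, t = noise_condition(lr, rng)
```

At test time the same mechanism lets one trade fidelity for robustness by choosing a fixed, nonzero conditioning level for out-of-distribution inputs.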