Score-based generative models (SGMs) have recently emerged as a promising class of generative models. The key idea is to produce high-quality images by repeatedly adding Gaussian noise and gradients to a Gaussian sample until it converges to the target distribution, a process known as diffusion sampling. To ensure stable convergence in sampling and good generation quality, however, this sequential sampling process has to take a small step size and many sampling iterations (e.g., 2000). Several acceleration methods have been proposed with a focus on low-resolution generation. In this work, we consider the acceleration of high-resolution generation with SGMs, a more challenging yet more important problem. We prove theoretically that this slow convergence is primarily due to the sampling process ignoring the target distribution. Further, we introduce a novel Target Distribution Aware Sampling (TDAS) method that leverages structural priors in the space and frequency domains. Extensive experiments on the CIFAR-10, CelebA, LSUN, and FFHQ datasets validate that TDAS can consistently accelerate state-of-the-art SGMs, particularly on the more challenging high-resolution (1024x1024) image generation tasks, by up to 18.4x, whilst largely maintaining the synthesis quality. With fewer sampling iterations, TDAS can still generate good-quality images. In contrast, the existing methods degrade drastically or even fail completely.
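The sampling loop described above (start from a Gaussian sample, then repeatedly add a gradient step and Gaussian noise) can be illustrated with a minimal sketch. This is plain unadjusted Langevin dynamics on a toy 2-D Gaussian target with an analytically known score; it is not TDAS and not the sampler of any particular SGM. The names `langevin_sample`, `toy_score`, `MU`, and `SIGMA`, as well as the step size and iteration counts, are assumptions chosen only for illustration.

```python
import numpy as np


def langevin_sample(score_fn, x0, step_size, n_steps, rng):
    """Unadjusted Langevin dynamics: x <- x + (eps/2) * score(x) + sqrt(eps) * z."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step_size * score_fn(x) + np.sqrt(step_size) * noise
    return x


# Hypothetical toy target: an isotropic Gaussian N(MU, SIGMA^2 I).
MU = np.array([3.0, -2.0])
SIGMA = 1.0


def toy_score(x):
    # Score (gradient of the log-density) of the toy Gaussian target.
    # In an SGM this would be a trained neural network instead.
    return -(x - MU) / SIGMA**2


rng = np.random.default_rng(0)
x_init = rng.standard_normal((5000, 2))  # initial samples from N(0, I)

# Same small step size, different iteration budgets.
many = langevin_sample(toy_score, x_init, step_size=0.1, n_steps=500, rng=rng)
few = langevin_sample(toy_score, x_init, step_size=0.1, n_steps=5, rng=rng)

print("mean after 500 steps:", many.mean(axis=0))
print("mean after   5 steps:", few.mean(axis=0))
```

With the same small step size, the long run should settle near the target mean while the short run remains close to the initial Gaussian, which is the iteration-count sensitivity that motivates acceleration methods.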