Employing a forward diffusion chain to gradually map the data to a noise distribution, diffusion-based generative models learn how to generate the data by inferring a reverse diffusion chain. However, this approach is slow and costly because it needs many forward and reverse steps. We propose a faster and cheaper approach that adds noise not until the data become pure random noise, but only until they reach a hidden noisy-data distribution that we can confidently learn. Then, we generate data with fewer reverse steps by starting from this hidden distribution, which is made similar to the noisy data. We reveal that the proposed model can be cast as an adversarial auto-encoder empowered by both the diffusion process and a learnable implicit prior. Experimental results show that even with a significantly smaller number of reverse diffusion steps, the proposed truncated diffusion probabilistic models consistently outperform the non-truncated ones in both unconditional and text-guided image generation.
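The truncation idea above can be illustrated with the standard closed-form forward-diffusion marginal. This is a minimal NumPy sketch, not the paper's implementation: the linear beta schedule, the step counts `T` and `T_trunc`, and the function names are illustrative assumptions. It only shows why stopping the forward chain early leaves a data-like distribution, which the paper then matches with a learnable implicit prior.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    where alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

# Full diffusion runs T steps until the signal is destroyed; a truncated
# chain stops at T_trunc << T, where the sample is still data-like.
T, T_trunc = 1000, 100                 # illustrative choices
betas = np.linspace(1e-4, 0.02, T)     # linear schedule (assumption)
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8))       # toy "data"

x_full = forward_diffuse(x0, T - 1, betas, rng)          # close to pure noise
x_trunc = forward_diffuse(x0, T_trunc - 1, betas, rng)   # retains most signal
```

Under this schedule, `alpha_bar` at step `T - 1` is nearly zero (the data are destroyed), while at step `T_trunc - 1` it is still well above 0.5, so the truncated endpoint keeps most of the signal; generation then only has to reverse those `T_trunc` steps, starting from a learned approximation of that noisy-data distribution.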