Tackling the Generative Learning Trilemma with Denoising Diffusion GANs

A wide variety of deep generative models has been developed in the past decade. Yet these models often struggle to satisfy three key requirements simultaneously: high sample quality, mode coverage, and fast sampling. We call the challenge imposed by these requirements the generative learning trilemma, as existing models often trade some of them for others. In particular, denoising diffusion models have shown impressive sample quality and diversity, but their expensive sampling does not yet allow them to be used in many real-world applications. In this paper, we argue that slow sampling in these models is fundamentally attributable to the Gaussian assumption in the denoising step, which is justified only for small step sizes. To enable denoising with large steps, and hence to reduce the total number of denoising steps, we propose to model the denoising distribution using a complex multimodal distribution. We introduce denoising diffusion generative adversarial networks (denoising diffusion GANs) that model each denoising step using a multimodal conditional GAN. Through extensive evaluations, we show that denoising diffusion GANs obtain sample quality and diversity competitive with original diffusion models while being 2000$\times$ faster on the CIFAR-10 dataset. Compared to traditional GANs, our model exhibits better mode coverage and sample diversity. To the best of our knowledge, denoising diffusion GAN is the first model that reduces sampling cost in diffusion models to an extent that allows them to be applied to real-world applications inexpensively. Project page and code: https://nvlabs.github.io/denoising-diffusion-gan
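
The claim that the Gaussian assumption holds only for small step sizes can be illustrated with a toy calculation. The sketch below is not taken from the paper or its code: it assumes a hypothetical 1D data distribution (an equal mixture of two narrow Gaussians at -1 and +1), a single forward step $q(x_t \mid x_0) = \mathcal{N}(\sqrt{1-\beta}\,x_0, \beta)$, and an arbitrarily chosen noisy observation `x_t = 0.2`. It evaluates the true denoising posterior $q(x_0 \mid x_t)$ on a grid via Bayes' rule and counts its modes for a small and a large $\beta$. The function names and the 1% peak-height threshold are illustrative choices.

```python
import numpy as np

def log_normal(x, mean, std):
    """Log density of a 1D Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

def true_denoising_posterior(x_t, beta, grid):
    """Posterior q(x_0 | x_t) on `grid` for the assumed toy setup."""
    # Assumed toy data distribution: 0.5 * N(-1, 0.1^2) + 0.5 * N(+1, 0.1^2).
    log_prior = np.log(0.5) + np.logaddexp(
        log_normal(grid, -1.0, 0.1), log_normal(grid, 1.0, 0.1)
    )
    # Forward diffusion step: x_t ~ N(sqrt(1 - beta) * x_0, beta).
    log_lik = log_normal(x_t, np.sqrt(1.0 - beta) * grid, np.sqrt(beta))
    log_post = log_prior + log_lik
    post = np.exp(log_post - log_post.max())  # subtract max for stability
    dx = grid[1] - grid[0]
    return post / (post.sum() * dx)           # normalize to integrate to 1

def count_modes(density, rel_height=0.01):
    """Count local maxima taller than rel_height times the global maximum."""
    interior = density[1:-1]
    is_peak = (interior > density[:-2]) & (interior > density[2:])
    return int(np.sum(is_peak & (interior > rel_height * density.max())))

grid = np.linspace(-2.0, 2.0, 4001)
x_t = 0.2  # arbitrary noisy observation, chosen for illustration

modes_small_step = count_modes(true_denoising_posterior(x_t, 0.001, grid))
modes_large_step = count_modes(true_denoising_posterior(x_t, 0.9, grid))
print("modes with beta=0.001:", modes_small_step)
print("modes with beta=0.9:  ", modes_large_step)
```

Under these assumptions the small-step posterior has a single narrow peak, which a Gaussian can approximate well, while the large-step posterior keeps both data modes. A unimodal Gaussian denoiser would fit that second case poorly, which is the situation the abstract proposes to handle with a multimodal conditional GAN.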
