DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents

From arXiv: 12 pages of main content. Major rework, including updated results on controllable synthesis, the speed-quality tradeoff, and state-of-the-art comparisons.

Diffusion probabilistic models have been shown to generate state-of-the-art results on several competitive image synthesis benchmarks, but they lack a low-dimensional, interpretable latent space and are slow at generation. Variational Autoencoders (VAEs), on the other hand, typically have access to a low-dimensional latent space but exhibit poor sample quality. Despite recent advances, VAEs usually require high-dimensional hierarchies of latent codes to generate high-quality samples. We present DiffuseVAE, a novel generative framework that integrates a VAE within a diffusion model framework and leverages this to design a novel conditional parameterization for diffusion models. We show that the resulting model improves upon the unconditional diffusion model in terms of sampling efficiency while also equipping diffusion models with a low-dimensional, VAE-inferred latent code. Furthermore, we show that the proposed model can generate high-resolution samples and exhibits synthesis quality comparable to state-of-the-art models on standard benchmarks. Lastly, we show that the proposed method can be used for controllable image synthesis and also exhibits out-of-the-box capabilities for downstream tasks such as image super-resolution and denoising. For reproducibility, our source code is publicly available at https://github.com/kpandey008/DiffuseVAE.

