Popular generative modeling methods such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) constrain the latent representation to follow a simple distribution such as an isotropic Gaussian. In this paper, we argue that learning a more complicated distribution over the latent space of an autoencoder enables more accurate modeling of complicated data distributions. Based on this observation, we propose a two-stage optimization procedure that maximizes an approximate implicit density model. We experimentally verify that our method outperforms GANs and VAEs on two image datasets (MNIST, CELEB-A). We also show that our approach is amenable to learning generative models for sequential data, by learning to generate speech and music.
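To make the two-stage idea concrete, the following is a minimal sketch, not the paper's exact procedure: stage 1 trains a plain autoencoder by reconstruction loss, and stage 2 fits a flexible density model over the learned latent codes and decodes samples drawn from it. The Gaussian mixture used here is only a stand-in for the approximate implicit density model, and the random tensor stands in for MNIST-like data; all names and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a two-stage latent-density approach (illustrative, not the
# paper's exact method). Stage 1: train an autoencoder. Stage 2: fit a richer
# density model (here a GMM as a stand-in) over latent codes, then sample.
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim), nn.Sigmoid())

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Stage 1: fit the autoencoder by reconstruction loss (toy data stands in for MNIST).
x = torch.rand(2048, 784)
model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    recon, _ = model(x)
    loss = nn.functional.binary_cross_entropy(recon, x)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: model the latent distribution with something richer than an
# isotropic Gaussian, then decode samples drawn from it.
with torch.no_grad():
    z = model.encoder(x).numpy()
gmm = GaussianMixture(n_components=10, covariance_type="full").fit(z)
z_new, _ = gmm.sample(16)
with torch.no_grad():
    samples = model.decoder(torch.tensor(z_new, dtype=torch.float32))
```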