Variational autoencoders (VAEs) are a powerful class of likelihood-based generative models with applications in many domains. However, they struggle to generate high-quality images, especially when samples are drawn from the prior without any tempering. One explanation for VAEs' poor generative quality is the prior hole problem: the prior distribution fails to match the aggregate approximate posterior. Because of this mismatch, there are regions of the latent space that have high density under the prior but do not correspond to any encoded image, and samples from those regions decode to corrupted images. To tackle this issue, we propose an energy-based prior defined as the product of a base prior distribution and a reweighting factor, designed to bring the base prior closer to the aggregate posterior. We train the reweighting factor by noise contrastive estimation, and we generalize it to hierarchical VAEs with many latent variable groups. Our experiments confirm that the proposed noise contrastive priors improve the generative performance of state-of-the-art VAEs by a large margin on the MNIST, CIFAR-10, CelebA 64, and CelebA HQ 256 datasets. Our method is simple and can be applied to a wide variety of VAEs to improve the expressivity of their prior distribution.
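The reweighted prior described above, written here as p(z) r(z) up to normalization, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the latent space is one-dimensional, the base prior p is N(0, 1), a stand-in "aggregate posterior" q is N(1, 0.5^2), the classifier is a logistic regression on quadratic features fitted by Newton's method, and samples are drawn by sampling-importance-resampling. The sketch relies only on the standard density-ratio fact behind noise contrastive estimation: with equal numbers of samples from q and p, the logit of the optimal binary classifier equals log q(z)/p(z), so r(z) = exp(logit(z)). All function names are hypothetical.

```python
import numpy as np

def features(z):
    # Quadratic features; sufficient to represent the log-ratio of two 1-D Gaussians.
    return np.stack([np.ones_like(z), z, z ** 2], axis=1)

def fit_log_ratio(z_post, z_prior, n_iter=25, ridge=1e-6):
    """Fit a logistic regression that separates posterior samples (label 1)
    from base-prior samples (label 0) using Newton's method.
    With equal sample counts, the fitted logit estimates log q(z)/p(z)."""
    X = features(np.concatenate([z_post, z_prior]))
    y = np.concatenate([np.ones(len(z_post)), np.zeros(len(z_prior))])
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        logits = np.clip(X @ w, -30.0, 30.0)  # clip for numerical stability
        p = 1.0 / (1.0 + np.exp(-logits))
        grad = X.T @ (y - p) - ridge * w
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        w = w + np.linalg.solve(hess, grad)
    return w

def log_r(z, w):
    # Log of the reweighting factor r(z): the classifier logit.
    return features(z) @ w

def sample_reweighted_prior(w, n_samples, n_proposals, rng):
    """Approximate samples from p(z) r(z) by sampling-importance-resampling:
    propose from the base prior, then resample proportionally to r(z)."""
    z = rng.standard_normal(n_proposals)
    lr = log_r(z, w)
    weights = np.exp(lr - lr.max())
    weights /= weights.sum()
    idx = rng.choice(n_proposals, size=n_samples, p=weights)
    return z[idx]

rng = np.random.default_rng(0)
z_post = 1.0 + 0.5 * rng.standard_normal(20000)   # stand-in aggregate posterior samples
z_prior = rng.standard_normal(20000)              # base prior ("noise") samples
w = fit_log_ratio(z_post, z_prior)
samples = sample_reweighted_prior(w, n_samples=5000, n_proposals=50000, rng=rng)
print("reweighted prior mean/std:", samples.mean(), samples.std())
```

In this toy setting the resampled latents should concentrate where the stand-in aggregate posterior has mass rather than where the base prior does, which is the behavior the reweighting factor is meant to produce. A real VAE would replace the quadratic logistic regression with a neural network classifier over encoder samples, and the hierarchical extension mentioned in the abstract is not covered by this sketch.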