Variational autoencoders (VAEs) are powerful generative models, but their generated samples and reconstructions are blurry compared to the images they were trained on. Considerable research effort has gone into increasing their generative capability through more flexible models, but this flexibility often comes at the cost of higher complexity and computational expense. Several works have instead altered the reconstruction term of the evidence lower bound (ELBO), though often at the expense of losing the mathematical link to maximizing the likelihood of the samples under the modeled distribution. Here we propose a new formulation of the VAE reconstruction term that specifically penalizes the generation of blurry images while still maximizing the ELBO under the modeled distribution. We demonstrate the potential of the proposed loss on three different data sets, where it outperforms several recently proposed reconstruction losses for VAEs.
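For context, the standard ELBO whose reconstruction term is the object of modification here can be written as (this is the textbook VAE objective, not the paper's new formulation):

\[
\mathcal{L}(\theta, \phi; x) \;=\; \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]}_{\text{reconstruction term}} \;-\; \underbrace{D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)}_{\text{regularization term}}
\]

where $q_\phi(z \mid x)$ is the encoder (approximate posterior), $p_\theta(x \mid z)$ the decoder likelihood, and $p(z)$ the prior over latent variables. The blurriness commonly attributed to VAEs stems largely from the choice of decoder likelihood in the reconstruction term, which is the component the proposed loss replaces.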