Variational Autoencoders (VAEs) are among the most commonly used generative models, particularly for image data. A prominent difficulty in training VAEs is data that is supported on a lower-dimensional manifold. Recent work by Dai and Wipf (2019) suggests that on such low-dimensional data, the generator converges to a solution with zero variance that is correctly supported on the ground truth manifold. In this paper, via a combination of theoretical and empirical results, we show that the story is more subtle. Precisely, we show that for linear encoders/decoders the story is mostly true: VAE training does recover a generator with support equal to the ground truth manifold, but this is due to the implicit bias of gradient descent rather than merely the VAE loss itself. In the nonlinear case, we show that VAE training frequently learns a higher-dimensional manifold that is a superset of the ground truth manifold.