The ability of likelihood-based probabilistic models to generalize to unseen data is central to many machine learning applications, such as lossless compression. In this work, we study the generalization of a popular class of probabilistic models, the Variational Auto-Encoder (VAE). We discuss the two generalization gaps that affect VAEs and show that overfitting is usually dominated by amortized inference. Based on this observation, we propose a new training objective that improves the generalization of amortized inference. We demonstrate how our method can improve performance in the context of image modeling and lossless compression.