Likelihood-based deep generative models have recently been shown to exhibit pathological behaviour under the manifold hypothesis as a consequence of using high-dimensional densities to model data with low-dimensional structure. In this paper we propose two methodologies aimed at addressing this problem. Both are based on adding Gaussian noise to the data to remove the dimensionality mismatch during training, and both provide a denoising mechanism whose goal is to sample from the model as though no noise had been added to the data. Our first approach is based on Tweedie's formula, and the second on models which take the variance of the added noise as a conditional input. We show that, surprisingly, while well motivated, these approaches only sporadically improve performance over not adding noise at all, and that other methods of addressing the dimensionality mismatch are more empirically adequate.
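As a point of reference for the denoising mechanism mentioned above, Tweedie's formula in the additive-Gaussian setting takes the following standard form (the notation below is ours, not taken from the abstract): if a noisy observation is $y = x + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, and $p_\sigma$ denotes the density of $y$, then
\[
  \mathbb{E}[x \mid y] \;=\; y + \sigma^2 \, \nabla_y \log p_\sigma(y),
\]
so a likelihood-based model fit to the noised data yields, through its score $\nabla_y \log p_\sigma$, an estimate of the posterior mean of the corresponding clean data point.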