Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high-dimensional densities. This formulation directly contradicts the manifold hypothesis, which states that observed data lies on a low-dimensional manifold embedded in high-dimensional ambient space. In this paper we investigate the pathologies of maximum-likelihood training in the presence of this dimensionality mismatch. We formally prove that degenerate optima are achieved wherein the manifold itself is learned but not the distribution on it, a phenomenon we call manifold overfitting. We propose a class of two-step procedures consisting of a dimensionality reduction step followed by maximum-likelihood density estimation, and prove that they recover the data-generating distribution in the nonparametric regime, thus avoiding manifold overfitting. We also show that these procedures enable density estimation on the manifolds learned by implicit models, such as generative adversarial networks, hence addressing a major shortcoming of these models. Several recently proposed methods are instances of our two-step procedures; we thus unify, extend, and theoretically justify a large class of models.
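To make the two-step procedure concrete, below is a minimal sketch (not the paper's exact setup): step 1 fits an autoencoder as the dimensionality-reduction step, and step 2 fits a maximum-likelihood density model on the learned low-dimensional codes. The PyTorch framework, the architectures, the dimensions, and the full-covariance Gaussian standing in for a general likelihood-based latent model are all illustrative assumptions.

```python
import torch
import torch.nn as nn

ambient_dim, latent_dim = 784, 16  # illustrative ambient and manifold dimensions

# Step 1: dimensionality reduction via an autoencoder trained on reconstruction error.
encoder = nn.Sequential(nn.Linear(ambient_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, ambient_dim))

def train_step1(x, opt):
    # One gradient step on the reconstruction objective; the decoder's image
    # approximates the low-dimensional manifold supporting the data.
    opt.zero_grad()
    loss = ((decoder(encoder(x)) - x) ** 2).mean()
    loss.backward()
    opt.step()
    return loss.item()

def fit_latent_density(z):
    # Step 2: maximum-likelihood density estimation on the latent codes.
    # A full-covariance Gaussian is used here as a stand-in for any
    # likelihood-based model (e.g. a normalizing flow) fit to the encoded data.
    mean = z.mean(dim=0)
    cov = torch.cov(z.T) + 1e-4 * torch.eye(z.shape[1])  # jitter for stability
    return torch.distributions.MultivariateNormal(mean, cov)

def sample(density, n):
    # Sampling: draw latent codes from the fitted density, then decode to ambient space.
    with torch.no_grad():
        return decoder(density.sample((n,)))

if __name__ == "__main__":
    x = torch.randn(512, ambient_dim)  # placeholder data for illustration only
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
    for _ in range(100):
        train_step1(x, opt)
    with torch.no_grad():
        density = fit_latent_density(encoder(x))
    print(sample(density, 4).shape)  # torch.Size([4, 784])
```

The same structure applies when the dimensionality-reduction step comes from an implicit model such as a GAN generator: the second step still fits an explicit density on the low-dimensional representation rather than in ambient space.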