Flow-based models typically define a latent space whose dimensionality matches that of the observation space. In many problems, however, the data does not populate the full ambient space in which it natively resides, but instead lies on a lower-dimensional manifold. In such scenarios, flow-based models cannot represent the data distribution exactly, since their densities always place support off the data manifold, potentially degrading model performance. To address this issue, we propose learning a manifold prior for flow models, leveraging the recently proposed spread divergence to resolve a crucial obstacle: the KL divergence, and hence maximum likelihood estimation, is ill-defined for manifold-supported distributions. Beyond improving both sample quality and representation quality, an auxiliary benefit of our approach is the ability to identify the intrinsic dimension of the manifold distribution.
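The failure mode the abstract describes can be made concrete in a toy setting. The sketch below (an illustration of the general spread-divergence idea, not the paper's implementation; the noise scale `sigma` and the toy distributions are assumptions) takes a data distribution p supported on a 1-D manifold embedded in R^2 (a standard Gaussian along the x-axis, a delta at y=0), so p has no density with respect to Lebesgue measure on R^2 and KL(p || q) against a full-support model q is ill-defined. Convolving both p and q with the same Gaussian noise kernel gives both full support, and the KL between the "spread" versions becomes well-defined; here it even has a closed form, which a Monte Carlo estimate confirms.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
sigma = 0.1        # spread-noise std; a hypothetical choice for illustration
n = 200_000

# p: N(0,1) along the x-axis, delta at y=0  -> no 2-D density, KL(p||q) undefined.
# q: standard 2-D Gaussian (full support).
# Convolving both with N(0, sigma^2 I) gives (per-dimension, independent):
#   spread-p: x ~ N(0, 1+sigma^2), y ~ N(0, sigma^2)
#   spread-q: x ~ N(0, 1+sigma^2), y ~ N(0, 1+sigma^2)

def kl_gauss(v1, v2):
    """KL( N(0, v1) || N(0, v2) ) between zero-mean 1-D Gaussians."""
    return 0.5 * (np.log(v2 / v1) + v1 / v2 - 1.0)

# The x-dimensions of spread-p and spread-q coincide, so only y contributes.
kl_analytic = kl_gauss(sigma**2, 1 + sigma**2)

# Monte Carlo check: average log-density ratio under samples from spread-p.
y = rng.normal(0.0, sigma, n)
log_ratio = norm.logpdf(y, scale=sigma) - norm.logpdf(y, scale=np.sqrt(1 + sigma**2))
kl_mc = log_ratio.mean()

print(f"analytic spread-KL: {kl_analytic:.4f}")
print(f"Monte Carlo:        {kl_mc:.4f}")
```

The key point is that the spread KL is finite and estimable, whereas the un-spread KL is not even defined; this is what makes maximum-likelihood-style training tractable when the data sits on a manifold.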