Flow-based generative models typically define a latent space with dimensionality identical to the observational space. In many problems, however, the data do not populate the full ambient space in which they natively reside, instead inhabiting a lower-dimensional manifold. In such scenarios, flow-based models cannot represent the data structure exactly, as their density always has support off the data manifold, potentially degrading model performance. In addition, the requirement of equal latent and data space dimensionality can unnecessarily increase the complexity of contemporary flow models. To address these problems, we propose to learn a manifold prior that benefits both sample generation and representation quality. An auxiliary benefit of our approach is the ability to identify the intrinsic dimension of the data distribution.
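For context, a minimal sketch of the change-of-variables identity that imposes this dimensionality constraint; the notation $f$, $p_Z$, and $D$ is standard flow notation and is not taken from the text above.

% Normalizing-flow density via change of variables with a bijection
% f : R^D -> R^D. The Jacobian determinant is only defined when latent
% and data dimensionalities match, which is why standard flows require
% equal dimensions.
\[
p_X(x) \;=\; p_Z\big(f(x)\big)\,
\left| \det \frac{\partial f(x)}{\partial x} \right|,
\qquad f : \mathbb{R}^D \to \mathbb{R}^D \ \text{bijective}.
\]

When the data concentrate on a $d$-dimensional manifold with $d < D$, a density of this full-dimensional form cannot vanish exactly off the manifold, which is the support mismatch described above.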