Normalizing flows are a widely used class of latent-variable generative models with a tractable likelihood. Affine-coupling models (Dinh et al., 2014-16) are a particularly common type of normalizing flow, for which the Jacobian of the latent-to-observable-variable transformation is triangular, allowing the likelihood to be computed in linear time. Despite the widespread use of affine couplings, the special structure of the architecture makes understanding their representational power challenging. The question of universal approximation was only recently resolved by three parallel papers (Huang et al., 2020; Zhang et al., 2020; Koehler et al., 2020), which showed that reasonably regular distributions can be approximated arbitrarily well using affine couplings, albeit with networks whose Jacobian is nearly singular. As ill-conditioned Jacobians are an obstacle for likelihood-based training, the fundamental question remains: which distributions can be approximated using well-conditioned affine coupling flows? In this paper, we show that any log-concave distribution can be approximated using well-conditioned affine-coupling flows. In terms of proof techniques, we uncover and leverage deep connections between affine coupling architectures, underdamped Langevin dynamics (a stochastic differential equation often used to sample from Gibbs measures) and Hénon maps (a structured dynamical system that appears in the study of symplectic diffeomorphisms). Our results also inform the practice of training affine couplings: we approximate a version of the input distribution padded with iid Gaussians, a strategy that Koehler et al. (2020) empirically observed to result in better-conditioned flows, but that hitherto had no theoretical grounding. Our proof can thus be seen as providing theoretical evidence for the benefits of Gaussian padding when training normalizing flows.
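
To make the triangular-Jacobian claim concrete, here is a minimal NumPy sketch of a single affine coupling layer. It is an illustration only, not the construction used in the paper: the helper `make_toy_nets` and its fixed random scale and translation maps `s` and `t` are hypothetical stand-ins for trained neural networks, and the even split of the coordinates is an assumption. The layer leaves the first half of the input unchanged and applies an elementwise affine map to the second half, so the Jacobian is block lower triangular and its log-determinant is the sum of the log-scales.

```python
import numpy as np

def make_toy_nets(d_in, d_out, seed=0):
    """Hypothetical stand-ins for the learned scale/translation networks."""
    rng = np.random.default_rng(seed)
    W_s = 0.1 * rng.standard_normal((d_in, d_out))
    W_t = rng.standard_normal((d_in, d_out))
    s = lambda u: np.tanh(u @ W_s)   # bounded log-scale keeps the layer well-conditioned
    t = lambda u: u @ W_t            # translation
    return s, t

def coupling_forward(z, s, t):
    """Map latent z to x; returns x and log|det dx/dz|."""
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    log_scale = s(z1)
    x2 = z2 * np.exp(log_scale) + t(z1)   # elementwise affine map of the second half
    # The Jacobian is block lower triangular with diagonal (1, ..., 1, exp(log_scale)),
    # so the log-determinant is a sum: linear time in the dimension.
    log_det = log_scale.sum(axis=-1)
    return np.concatenate([z1, x2], axis=-1), log_det

def coupling_inverse(x, s, t):
    """Exact inverse: the first half passes through unchanged, so s and t can be re-evaluated."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    z2 = (x2 - t(x1)) * np.exp(-s(x1))
    return np.concatenate([x1, z2], axis=-1)
```

Bounding the log-scale with `tanh` is one simple way to keep the per-layer Jacobian away from singularity in this toy; it is a design choice of the sketch and not a claim about how the paper obtains well-conditioned flows.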