The central objective function of a variational autoencoder (VAE) is its variational lower bound (the ELBO). Here we show that for standard (i.e., Gaussian) VAEs the ELBO converges to a value given by the sum of three entropies: the (negative) entropy of the prior distribution, the expected (negative) entropy of the observable distribution, and the average entropy of the variational distributions (the latter is already part of the ELBO). Our derived analytical results are exact and apply to small networks as well as to intricate deep networks for encoder and decoder. Furthermore, they hold for finitely as well as infinitely many data points and at any stationary point (including local maxima and saddle points). One implication is that, for standard VAEs, the ELBO can often be computed in closed form at stationary points, whereas the original ELBO requires numerical approximations of integrals. As our main contribution, we provide the proof that the ELBO of VAEs is equal to entropy sums at stationary points. Numerical experiments then show that the obtained analytical results are sufficiently precise also in those vicinities of stationary points that are reached in practice. Furthermore, we discuss how the novel entropy form of the ELBO can be used to analyze and understand learning behavior. More generally, we believe that our contributions can be useful for future theoretical and practical studies of VAE learning, as they provide novel information about the points in parameter space that optimization of VAEs converges to.
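To make the claimed identity concrete, a sketch in symbols (the notation here is assumed for illustration and not fixed by the abstract): write \(\mathcal{H}[\cdot]\) for (differential) entropy, \(q_\Phi(z \mid x^{(n)})\) for the variational distribution of data point \(x^{(n)}\), \(p_\Theta(z)\) for the prior, and \(p_\Theta(x \mid z)\) for the observable (decoder) distribution. At a stationary point of learning, the ELBO \(\mathcal{F}(\Phi,\Theta)\) then takes the entropy-sum form

\[
\mathcal{F}(\Phi,\Theta)
= \frac{1}{N}\sum_{n=1}^{N}\mathcal{H}\!\left[q_\Phi\!\left(z \mid x^{(n)}\right)\right]
\;-\; \mathcal{H}\!\left[p_\Theta(z)\right]
\;-\; \frac{1}{N}\sum_{n=1}^{N}\mathbb{E}_{q_\Phi\left(z \mid x^{(n)}\right)}\!\left[\mathcal{H}\!\left[p_\Theta(x \mid z)\right]\right].
\]

For the Gaussian distributions of standard VAEs each entropy has a closed form (e.g., \(\mathcal{H}[\mathcal{N}(\mu,\Sigma)] = \tfrac{1}{2}\log\det(2\pi e\,\Sigma)\)), which is why the entropy-sum form can avoid the numerical integration required by the original ELBO.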