Information Theoretic Structured Generative Modeling

Rényi's information provides a theoretical foundation for tractable and data-efficient non-parametric density estimation, based on pairwise evaluations in a reproducing kernel Hilbert space (RKHS). This paper extends this framework to parametric probabilistic modeling, motivated by the fact that Rényi's information can be estimated in closed form for Gaussian mixtures. Based on this connection, we propose a novel generative model framework, the structured generative model (SGM), that makes straightforward optimization possible: its costs are scale-invariant, which avoids high gradient variance while imposing fewer restrictions on absolute continuity, a major advantage in parametric information-theoretic optimization. The implementation employs a single neural network, driven by an orthonormal input appended to a single white noise source, that is adapted to learn an infinite Gaussian mixture model (IMoG), which provides an empirically tractable model distribution in low dimensions. To train SGM, we provide three novel variational cost functions, based on Rényi's second-order entropy and divergence, to implement minimization of cross-entropy, minimization of variational representations of $f$-divergence, and maximization of the evidence lower bound (conditional probability). We test the framework on mutual information estimation, comparing the results with mutual information neural estimation (MINE), as well as on density estimation, on conditional probability estimation in Markov models, and on training adversarial networks. Our preliminary results show that SGM significantly improves on MINE in terms of data efficiency and variance, on conventional and variational Gaussian mixture models, and on the performance of generative adversarial networks.
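
The closed-form property mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It only uses the standard identity that the integral of a product of two Gaussian densities is itself a Gaussian density evaluated at the difference of the means, so that for a mixture p(x) = sum_i w_i N(x; mu_i, S_i) the second-order Rényi entropy is H2(p) = -log sum_i sum_j w_i w_j N(mu_i; mu_j, S_i + S_j). The function names `gaussian_pdf` and `renyi2_entropy_gmm`, and the example mixture parameters, are hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return np.exp(-0.5 * (d * np.log(2.0 * np.pi) + logdet + quad))


def renyi2_entropy_gmm(weights, means, covs):
    """Second-order Renyi entropy, -log(integral of p(x)^2 dx), of a Gaussian mixture.

    Uses the identity: integral of N(x; m_i, S_i) N(x; m_j, S_j) dx
    equals N(m_i; m_j, S_i + S_j), so no sampling is needed.
    """
    information_potential = 0.0
    for w_i, m_i, s_i in zip(weights, means, covs):
        for w_j, m_j, s_j in zip(weights, means, covs):
            information_potential += w_i * w_j * gaussian_pdf(m_i, m_j, s_i + s_j)
    return -np.log(information_potential)


# Illustrative two-component mixture in two dimensions (assumed values).
weights = np.array([0.4, 0.6])
means = [np.array([0.0, 0.0]), np.array([3.0, 1.0])]
covs = [np.eye(2), 0.5 * np.eye(2)]

h2 = renyi2_entropy_gmm(weights, means, covs)
print("Renyi second-order entropy of the mixture:", h2)
```

A quick sanity check on such a sketch is the single-Gaussian case, where the result reduces to (d/2) log(4π) + (1/2) log|S|. Because the double sum costs O(K²) Gaussian evaluations for K components, this direct form is practical only for mixtures of moderate size.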
