While Variational Inference (VI) is central to modern generative models such as Variational Autoencoders (VAEs) and Denoising Diffusion Models (DDMs), its pedagogical treatment is split across disciplines. In statistics, VI is typically framed as a Bayesian method for posterior approximation. In machine learning, however, VAEs and DDMs are developed from a Frequentist viewpoint, where VI is used to approximately compute a maximum likelihood estimator. This creates a barrier for statisticians, as the principles behind VAEs and DDMs are hard to contextualize without a corresponding Frequentist introduction to VI. This paper provides that introduction: we explain the theory of VI, VAEs, and DDMs from a purely Frequentist perspective, starting from the classical Expectation-Maximization (EM) algorithm. We show how VI arises as a scalable solution for intractable E-steps, and how VAEs and DDMs are natural deep-learning-based extensions of this framework, thereby bridging the gap between classical statistical inference and modern generative AI.
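To make the EM-to-VI connection summarized above concrete, the standard evidence decomposition can be sketched as follows (a minimal sketch of textbook identities in generic notation, not an excerpt from the paper body):

\[
\log p_{\theta}(x)
= \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p_{\theta}(x, z)}{q(z)}\right]}_{\mathrm{ELBO}(q,\,\theta)}
+ \mathrm{KL}\big(q(z) \,\|\, p_{\theta}(z \mid x)\big)
\;\ge\; \mathrm{ELBO}(q,\theta).
\]

In EM, the E-step sets \(q(z) = p_{\theta_t}(z \mid x)\), making the KL term vanish so the bound is tight, and the M-step maximizes the ELBO over \(\theta\). When that posterior is intractable, VI instead restricts \(q\) to a tractable family \(\mathcal{Q}\) and maximizes the ELBO jointly over \(q \in \mathcal{Q}\) and \(\theta\), which is the "approximate E-step" reading of VI referred to above.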