A Frequentist Statistical Introduction to Variational Inference, Autoencoders, and Diffusion Models

While Variational Inference (VI) is central to modern generative models like Variational Autoencoders (VAEs) and Denoising Diffusion Models (DDMs), its pedagogical treatment is split across disciplines. In statistics, VI is typically framed as a Bayesian method for posterior approximation. In machine learning, however, VAEs and DDMs are developed from a Frequentist viewpoint, where VI is used to approximate a maximum likelihood estimator. This creates a barrier for statisticians, as the principles behind VAEs and DDMs are hard to contextualize without a corresponding Frequentist introduction to VI. This paper provides that introduction: we explain the theory for VI, VAEs, and DDMs from a purely Frequentist perspective, starting with the classical Expectation-Maximization (EM) algorithm. We show how VI arises as a scalable solution for intractable E-steps and how VAEs and DDMs are natural, deep-learning-based extensions of this framework, thereby bridging the gap between classical statistical inference and modern generative AI.
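
As a minimal sketch of the EM-to-VI connection the abstract describes (the notation is an assumption for illustration, not taken from the paper): for observed data x, a latent variable z, model parameters θ, and any distribution q(z) over the latent variable, the log-likelihood decomposes as

```latex
\log p_\theta(x)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p_\theta(x, z)}{q(z)}\right]}_{\mathrm{ELBO}(q,\,\theta)}
  + \mathrm{KL}\!\left(q(z)\,\|\,p_\theta(z \mid x)\right)
```

Because the KL term is non-negative, the ELBO is a lower bound on the log-likelihood. The exact E-step of EM sets q(z) = p_θ(z | x), which makes the KL term zero. When that conditional distribution is intractable, VI instead maximizes the ELBO over a restricted family of distributions q, while the M-step maximizes the same bound over θ.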

