Diffusion Models in Vision: A Survey

Denoising diffusion models are a recently emerging topic in computer vision, demonstrating remarkable results in generative modeling. A diffusion model is a deep generative model built on two stages: a forward diffusion stage and a reverse diffusion stage. In the forward diffusion stage, the input data is gradually perturbed over several steps by adding Gaussian noise. In the reverse stage, a model is tasked with recovering the original input data by learning to reverse the diffusion process step by step. Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burden, i.e., slow sampling caused by the large number of steps involved. In this survey, we provide a comprehensive review of articles on denoising diffusion models applied in vision, comprising both theoretical and practical contributions in the field. First, we identify and present three generic diffusion modeling frameworks, which are based on denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. We further discuss the relations between diffusion models and other deep generative models, including variational auto-encoders, generative adversarial networks, energy-based models, autoregressive models, and normalizing flows. Then, we introduce a multi-perspective categorization of diffusion models applied in computer vision. Finally, we illustrate the current limitations of diffusion models and envision some interesting directions for future research.
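
The forward diffusion stage described above can be illustrated with a minimal sketch. It assumes the closed-form noising step commonly used for denoising diffusion probabilistic models, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with a linear variance schedule. The function names, the schedule endpoints (1e-4 to 0.02), the 1000 steps, and the toy 16-value input are all assumptions made for illustration and do not come from the abstract.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear variance schedule (assumed, not from the abstract)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t): the fraction of signal variance kept at step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Jump directly to step t: scale the clean data and add Gaussian noise."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal_scale * v + noise_scale * e for v, e in zip(x0, noise)]
    return x_t, noise


# Toy usage: a flattened 16-value "image" with entries in [-1, 1].
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16)]
x_early, noise_early = forward_diffuse(x0, 10, alpha_bars, rng)   # still close to x0
x_late, noise_late = forward_diffuse(x0, 999, alpha_bars, rng)    # almost pure noise
```

Under these assumptions the retained signal fraction shrinks monotonically with the step index, so late steps are nearly indistinguishable from Gaussian noise. The reverse stage would train a model to predict the added noise (or the clean input) from `x_t` and `t`, which this sketch does not cover.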

