Diffusion-based generative models have recently achieved promising results, but they raise an array of open questions concerning conceptual understanding, theoretical analysis, algorithmic improvement, and extensions to discrete, structured, and non-Euclidean domains. This work re-examines the overall framework in order to gain a better theoretical understanding and to develop algorithmic extensions for data from arbitrary domains. By viewing diffusion models as latent variable models with unobserved diffusion trajectories, and applying maximum likelihood estimation (MLE) with latent trajectories imputed from an auxiliary distribution, we show that both the model construction and the imputation of latent trajectories amount to constructing diffusion bridge processes that attain deterministic values and satisfy constraints at their end points, for which we provide a systematic study and a suite of tools. Leveraging our framework, we present 1) a first theoretical error analysis for learning diffusion generative models, and 2) a simple and unified approach to learning on data from different discrete and constrained domains. Experiments show that our methods perform superbly on generating images, semantic segments, and 3D point clouds.
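To make the notion of a bridge process concrete, the following is a minimal sketch of the simplest textbook case, a Brownian bridge, simulated with the Euler-Maruyama scheme. It is an illustration only and not the learned model or the imputation procedure of this work. The function name `simulate_brownian_bridge`, the time horizon [0, 1], the zero initial state, and the constant noise scale `sigma` are all assumptions made for this example. The drift (x - Z_t) / (1 - t) pulls the trajectory toward the fixed target x, so the end point is pinned to x up to a discretization error of order sqrt(dt).

```python
import numpy as np


def simulate_brownian_bridge(x_target, n_steps=1000, sigma=1.0, seed=0):
    """Simulate dZ_t = (x - Z_t) / (1 - t) dt + sigma dW_t on [0, 1], with Z_0 = 0.

    Hypothetical helper for illustration; returns an array of shape
    (n_steps + 1, *x_target.shape) holding the whole trajectory.
    """
    rng = np.random.default_rng(seed)
    x_target = np.asarray(x_target, dtype=float)
    dt = 1.0 / n_steps
    z = np.zeros_like(x_target)  # assumed initial state Z_0 = 0
    path = [z.copy()]
    for k in range(n_steps):
        t = k * dt  # t <= 1 - dt, so the division below is safe
        drift = (x_target - z) / (1.0 - t)  # pulls Z_t toward the target
        noise = sigma * np.sqrt(dt) * rng.standard_normal(z.shape)
        z = z + drift * dt + noise
        path.append(z.copy())
    return np.stack(path)


if __name__ == "__main__":
    x = np.array([1.0, -2.0])
    path = simulate_brownian_bridge(x)
    print("end point:", path[-1], "target:", x)
```

In the last step the drift term moves the state exactly onto the target, and only that step's noise (standard deviation `sigma * sqrt(dt)`) remains, which is why the simulated end point agrees with x only approximately.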