We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL$\cdot$E 2. Our main result is that, assuming accurate score estimates, such SGMs can efficiently sample from essentially any realistic data distribution. In contrast to prior works, our results (1) hold for an $L^2$-accurate score estimate (rather than $L^\infty$-accurate); (2) do not require restrictive functional inequality conditions that preclude substantial non-log-concavity; (3) scale polynomially in all relevant problem parameters; and (4) match state-of-the-art complexity guarantees for discretization of the Langevin diffusion, provided that the score error is sufficiently small. We view this as strong theoretical justification for the empirical success of SGMs. We also examine SGMs based on the critically damped Langevin diffusion (CLD). Contrary to conventional wisdom, we provide evidence that the use of the CLD does not reduce the complexity of SGMs.