Score-based generative modeling (SGM) has grown to be a hugely successful method for learning to generate samples from complex data distributions such as those of images and audio. It is based on evolving an SDE that transforms white noise into a sample from the learned distribution, using estimates of the score function, i.e., the gradient of the log-density. Previous convergence analyses for these methods have suffered either from strong assumptions on the data distribution or from exponential dependencies, and hence fail to give efficient guarantees for the multimodal and non-smooth distributions that arise in practice and for which good empirical performance is observed. We consider a popular kind of SGM -- denoising diffusion models -- and give polynomial convergence guarantees for general data distributions, with no assumptions related to functional inequalities or smoothness. Assuming $L^2$-accurate score estimates, we obtain Wasserstein distance guarantees for any distribution of bounded support or sufficiently decaying tails, as well as TV guarantees for distributions under further smoothness assumptions.
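To make the mechanism concrete, here is a minimal toy sketch (not from the paper) of the reverse-time SDE sampler the abstract describes. It uses an Ornstein–Uhlenbeck forward process and, in place of a learned score estimate, the exact score of a Gaussian "data" distribution $N(0, \sigma_0^2)$, so the reverse dynamics should recover samples with variance close to $\sigma_0^2$. All names and parameter values here are illustrative assumptions.

```python
import numpy as np

# Toy sketch: reverse-time SDE sampling with an OU forward process.
# The "score model" below is the exact score of the OU-smoothed Gaussian,
# standing in for the L^2-accurate score estimate assumed in the abstract.

rng = np.random.default_rng(0)
sigma0_sq = 0.25          # variance of the toy data distribution N(0, 0.25)
T, n_steps = 5.0, 1000    # time horizon and number of Euler-Maruyama steps
h = T / n_steps

def score(x, t):
    # Exact score d/dx log p_t(x), where p_t = N(0, sigma0^2 e^{-2t} + 1 - e^{-2t})
    # is the law of the OU process dX = -X dt + sqrt(2) dB started from the data.
    var_t = sigma0_sq * np.exp(-2 * t) + 1 - np.exp(-2 * t)
    return -x / var_t

# Start from (approximately) the stationary white-noise distribution N(0, 1) ...
x = rng.standard_normal(20_000)
# ... and integrate the reverse SDE dX = [X + 2 score(X, t)] dt + sqrt(2) dB
# backward from t = T to t = 0 with Euler-Maruyama steps.
for k in range(n_steps):
    t = T - k * h
    x = x + h * (x + 2 * score(x, t)) + np.sqrt(2 * h) * rng.standard_normal(x.shape)

print(x.mean(), x.var())  # empirically close to 0 and sigma0_sq
```

The discretization of this reverse SDE with an estimated (rather than exact) score is precisely the setting whose convergence the abstract's guarantees address.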