While efficient distribution learning is no doubt behind the groundbreaking success of diffusion modeling, its theoretical guarantees are quite limited. In this paper, we provide the first rigorous analysis of the approximation and generalization abilities of diffusion modeling for well-known function spaces. The highlight of this paper is that when the true density function belongs to the Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves the nearly minimax optimal estimation rates in the total variation distance and in the Wasserstein distance of order one. Furthermore, we extend our theory to demonstrate how diffusion models adapt to low-dimensional data distributions. We expect these results will advance theoretical understanding of diffusion modeling and its ability to generate verisimilar outputs.
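To make the "empirical score matching loss" concrete, here is a minimal NumPy sketch of the denoising score matching objective under the standard Ornstein–Uhlenbeck forward process (the function names and the toy linear score model are illustrative assumptions; the paper's analysis concerns neural network score models):

```python
import numpy as np


def empirical_dsm_loss(score_fn, data, t, rng):
    """Empirical denoising score matching loss at noise scale t.

    Each sample is perturbed by the OU forward process
    x_t = m * x_0 + s * eps with m = exp(-t), s = sqrt(1 - exp(-2t)),
    eps ~ N(0, I); the regression target for the score is -eps / s.
    """
    m = np.exp(-t)
    s = np.sqrt(1.0 - np.exp(-2.0 * t))
    eps = rng.standard_normal(data.shape)
    x_t = m * data + s * eps
    target = -eps / s
    residual = score_fn(x_t, t) - target
    # Mean over samples of the squared Euclidean norm of the residual.
    return np.mean(np.sum(residual**2, axis=-1))


# Toy check: for x_0 ~ N(0, 1), x_t is again N(0, 1) because m^2 + s^2 = 1,
# so the true score of the perturbed density is simply s(x, t) = -x.
rng = np.random.default_rng(0)
data = rng.standard_normal((4096, 1))
t = 0.5

loss_true = empirical_dsm_loss(lambda x, t: -x, data, t,
                               np.random.default_rng(1))
loss_zero = empirical_dsm_loss(lambda x, t: np.zeros_like(x), data, t,
                               np.random.default_rng(1))
```

The true score attains a strictly smaller empirical loss than the trivial zero score, which is the sense in which "properly minimizing" this loss recovers the score function driving the reverse diffusion.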