A central problem in machine learning involves modeling complex datasets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation remain analytically or computationally tractable. Here, we develop an approach that achieves both flexibility and tractability simultaneously. The essential idea, inspired by non-equilibrium statistical physics, is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in the data, yielding a highly flexible and tractable generative model. This approach allows us to rapidly learn, sample from, and evaluate probabilities in deep generative models with thousands of layers or time steps, as well as to compute conditional and posterior probabilities under the learned model. We additionally release an open-source reference implementation of the algorithm.
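To make the forward process concrete, the following is a minimal NumPy sketch of the kind of Gaussian diffusion described above: each step q(x_t | x_{t-1}) = N(x_t; sqrt(1-β_t) x_{t-1}, β_t I) blends the data toward an identity-covariance Gaussian, the analytically tractable endpoint. The function name `forward_diffusion`, the linear β schedule, and the toy dataset are illustrative assumptions, not taken from the paper's reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_diffusion(x0, betas):
    """Iteratively destroy structure in the data with Gaussian noise.

    Implements the forward diffusion kernel
        q(x_t | x_{t-1}) = N(x_t; sqrt(1 - beta_t) * x_{t-1}, beta_t * I)
    and returns the full trajectory [x_0, x_1, ..., x_T].
    """
    x = x0.copy()
    trajectory = [x]
    for beta in betas:
        noise = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
        trajectory.append(x)
    return trajectory

# Toy usage: 512 points from an anisotropic 2-D Gaussian, diffused over
# T = 1000 steps with an assumed linear beta schedule. The final element
# of the trajectory is close to an isotropic unit Gaussian.
x0 = rng.standard_normal((512, 2)) * np.array([3.0, 0.3])
betas = np.linspace(1e-4, 2e-2, 1000)
trajectory = forward_diffusion(x0, betas)
```

The reverse (generative) process would then be learned as a chain of small Gaussian steps running in the opposite direction, each parameterized by a trained model; the sketch above covers only the fixed forward half.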