Denoising diffusions are state-of-the-art generative models which exhibit remarkable empirical performance and come with theoretical guarantees. The core idea of these models is to progressively transform the empirical data distribution into a simple Gaussian distribution by adding noise using a diffusion. We obtain new samples whose distribution is close to the data distribution by simulating a "denoising" diffusion approximating the time reversal of this "noising" diffusion. This denoising diffusion relies on approximations of the logarithmic derivatives of the noised data densities, known as scores, obtained using score matching. Such models can be easily extended to perform approximate posterior simulation in high-dimensional scenarios where one can only sample from the prior and simulate synthetic observations from the likelihood. These methods have been primarily developed for data on $\mathbb{R}^d$ while extensions to more general spaces have been developed on a case-by-case basis. We propose here a general framework which not only unifies and generalizes this approach to a wide class of spaces but also leads to an original extension of score matching. We illustrate the resulting class of denoising Markov models on various applications.
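The noising/denoising pipeline described above can be illustrated with a minimal sketch on $\mathbb{R}$, kept deliberately simple rather than following the paper's general construction: a DDPM-style discretized forward diffusion noises Gaussian toy data, a per-step linear score model is fit by denoising score matching (for Gaussian data the score of each noised marginal is linear, so the closed-form regression below is exact in this model class), and ancestral sampling of the learned time reversal recovers the data distribution. All variable names and the specific schedule are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data distribution: x0 ~ N(0, sigma0^2) on the real line.
sigma0 = 2.0
T = 200
betas = np.linspace(1e-4, 0.05, T)   # illustrative variance schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

# Denoising score matching with a per-step linear model s_t(x) = a_t * x.
# Each a_t is the closed-form minimiser of the regression objective
#   E || s_t(x_t) + eps / sqrt(1 - alpha_bar_t) ||^2,
# estimated by Monte Carlo over forward ("noising") samples.
n_train = 200_000
a = np.zeros(T)
for t in range(T):
    m, s = np.sqrt(alpha_bars[t]), np.sqrt(1.0 - alpha_bars[t])
    x0 = sigma0 * rng.standard_normal(n_train)
    eps = rng.standard_normal(n_train)
    xt = m * x0 + s * eps                      # forward noising sample
    a[t] = -np.mean(xt * eps) / (s * np.mean(xt**2))

# Ancestral sampling of the learned "denoising" diffusion (DDPM-style
# time reversal), started from the simple Gaussian reference.
n_samples = 20_000
x = rng.standard_normal(n_samples)             # x_T ~ N(0, 1)
for t in range(T - 1, -1, -1):
    score = a[t] * x
    x = (x + betas[t] * score) / np.sqrt(alphas[t])
    if t > 0:
        x += np.sqrt(betas[t]) * rng.standard_normal(n_samples)

print(round(float(np.std(x)), 2))  # should be close to sigma0
```

The sample standard deviation lands near the data value $\sigma_0 = 2$, showing the learned denoising chain approximately transports the reference Gaussian back to the data distribution; in practice the linear per-step model is replaced by a neural score network.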