Score-based modeling through stochastic differential equations (SDEs) has provided a new perspective on diffusion models, and demonstrated superior performance on continuous data. However, the gradient of the log-likelihood function, i.e., the score function, is not properly defined for discrete spaces. This makes it non-trivial to adapt score-based modeling to categorical data. In this paper, we extend diffusion models to discrete variables by introducing a stochastic jump process where the reverse process denoises via a continuous-time Markov chain. This formulation admits an analytical simulation during backward sampling. To learn the reverse process, we extend score matching to general categorical data and show that an unbiased estimator can be obtained via simple matching of the conditional marginal distributions. We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
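As a rough illustration of what a continuous-time Markov chain over categorical states looks like, the sketch below simulates a forward (noising) jump process. It rests on an assumption not stated in the abstract: a uniform rate matrix Q = beta * (11^T / S - I), for which the transition matrix exp(tQ) has the closed form e^{-beta t} I + (1 - e^{-beta t}) 11^T / S. The names `transition_matrix`, `corrupt`, and `beta` are hypothetical and chosen only for this example; this is not the paper's implementation.

```python
import math
import random


def transition_matrix(num_states, t, beta=1.0):
    """Closed-form P_t = exp(t * Q) for the ASSUMED uniform rate matrix
    Q = beta * (ones / num_states - identity)."""
    keep = math.exp(-beta * t)          # probability that no jump has occurred by time t
    off = (1.0 - keep) / num_states     # mass spread uniformly over all states after a jump
    return [
        [off + (keep if i == j else 0.0) for j in range(num_states)]
        for i in range(num_states)
    ]


def corrupt(x0, num_states, t, rng, beta=1.0):
    """Sample x_t given x_0 independently per position, using row x_0 of P_t."""
    p_t = transition_matrix(num_states, t, beta)
    states = range(num_states)
    return [rng.choices(states, weights=p_t[x])[0] for x in x0]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [0, 1, 2, 3, 0, 1, 2, 3]
    for t in (0.0, 0.5, 5.0):
        print(t, corrupt(clean, num_states=4, t=t, rng=rng))
```

At t = 0 the sequence is returned unchanged, and as t grows each position approaches a uniform draw over the states. A reverse-time sampler would need the learned conditional marginals described in the abstract, which this sketch does not attempt.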