We provide the first complete continuous-time framework for denoising diffusion models of discrete data. This is achieved by formulating the forward noising process and the corresponding reverse-time generative process as Continuous Time Markov Chains (CTMCs). The model can be trained efficiently using a continuous-time version of the ELBO. We simulate the high-dimensional CTMC using techniques developed in chemical physics and exploit our continuous-time framework to derive high-performance samplers that we show can outperform discrete-time methods for discrete data. The continuous-time treatment also enables us to derive a novel theoretical result bounding the error between the generated sample distribution and the true data distribution.
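To make the idea of a forward noising CTMC concrete, the following is a minimal sketch of simulating a single discrete variable under a uniform rate matrix with Gillespie's algorithm, an exact simulation method that originated in chemical physics. The abstract does not say which simulation technique the authors use, so this is an illustration rather than their method. The function name `simulate_uniform_ctmc`, the uniform rate structure, and all parameter values are assumptions chosen for the example.

```python
import random


def simulate_uniform_ctmc(x0, num_states, rate, t_end, rng):
    """Simulate one CTMC path on states 0..num_states-1 up to time t_end.

    Assumption for illustration: every off-diagonal entry of the rate
    matrix equals `rate`, so the chain drifts toward a uniform distribution.
    Returns a list of (jump_time, state) pairs, starting with (0.0, x0).
    """
    t, x = 0.0, x0
    path = [(t, x)]
    total_rate = rate * (num_states - 1)  # sum of rates out of any state
    if total_rate <= 0:
        return path  # no transitions are possible
    while True:
        # The holding time in the current state is exponential
        # with parameter equal to the total exit rate.
        t += rng.expovariate(total_rate)
        if t >= t_end:
            break
        # Jump to one of the other states, chosen uniformly at random.
        y = rng.randrange(num_states - 1)
        x = y if y < x else y + 1  # skip over the current state
        path.append((t, x))
    return path


rng = random.Random(0)
path = simulate_uniform_ctmc(x0=3, num_states=8, rate=0.5, t_end=4.0, rng=rng)
final_state = path[-1][1]  # the "noised" value at time t_end
```

For long enough `t_end`, `final_state` is close to uniformly distributed regardless of `x0`, which is the sense in which this forward process destroys information about the data. Exact simulation of this kind handles one jump at a time, so it becomes costly in high dimensions.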