Diffusion models have emerged as one of the most promising frameworks for deep generative modeling. In this work, we explore the potential of non-uniform diffusion models. We show that non-uniform diffusion leads to multi-scale diffusion models, which have a structure similar to that of multi-scale normalizing flows. We find experimentally that, in the same or less training time, the multi-scale diffusion model achieves a better FID score than the standard uniform diffusion model. More importantly, it generates samples $4.4$ times faster at $128\times 128$ resolution. The speed-up is expected to be even greater at higher resolutions, where more scales are used. Moreover, we show that non-uniform diffusion leads to a novel estimator for the conditional score function which achieves performance on par with the state-of-the-art conditional denoising estimator. Our theoretical and experimental findings are accompanied by an open-source library, MSDiff, which can facilitate further research on non-uniform diffusion models.