Score-based Generative Models (SGMs) have achieved state-of-the-art synthesis results on diverse tasks. However, the current design space of the forward diffusion process is largely unexplored and often relies on physical intuition or simplifying assumptions. Leveraging results from the design of scalable Bayesian posterior samplers, we present a complete recipe for constructing forward processes in SGMs, all of which are guaranteed to converge to the target distribution of interest. We show that several existing SGMs can be cast as specific instantiations of this parameterization. Furthermore, building on this recipe, we construct a novel SGM: Phase Space Langevin Diffusion (PSLD), which performs score-based modeling in a space augmented with auxiliary variables akin to a physical phase space. We show that PSLD outperforms competing baselines in terms of sample quality and the speed-vs-quality tradeoff across different samplers on various standard image synthesis benchmarks. Moreover, we show that PSLD achieves sample quality comparable to state-of-the-art SGMs (FID: 2.10 on unconditional CIFAR-10 generation), providing an attractive alternative as an SGM backbone for further development. We will publish our code and model checkpoints for reproducibility at https://github.com/mandt-lab/PSLD.
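As a rough illustration of the convergence guarantee mentioned above, the sketch below simulates a two-dimensional forward process of the form dz = -(D + Q) ∇H(z) dt + sqrt(2D) dW, where D is positive semidefinite, Q is skew-symmetric, and the stationary density is proportional to exp(-H(z)). This parameterization is an assumption borrowed from the stochastic-gradient MCMC literature. The abstract does not spell out the recipe, and the names `a`, `b`, `c`, `simulate_recipe_sde`, and `moments` are hypothetical. With H(z) = (x² + m²)/2, particles started at a fixed point should end up approximately standard normal in the augmented (x, m) space. This is not the PSLD model itself, only a toy instance of the same idea.

```python
import math
import random


def simulate_recipe_sde(a, b, c, z0, n_particles=2000, n_steps=500, dt=0.02, seed=0):
    """Euler-Maruyama simulation of dz = -(D + Q) grad H(z) dt + sqrt(2 D) dW.

    Assumed (illustrative) choices, not taken from the paper:
      H(z) = 0.5 * (x**2 + m**2)   -> target is a standard normal in (x, m)
      D = diag(a, b)               -> positive semidefinite diffusion matrix
      Q = [[0, -c], [c, 0]]        -> skew-symmetric coupling between x and m
    D and Q are constant, so the recipe's divergence correction term vanishes.
    """
    rng = random.Random(seed)
    noise_x = math.sqrt(2.0 * a * dt)
    noise_m = math.sqrt(2.0 * b * dt)
    samples = []
    for _ in range(n_particles):
        x, m = z0
        for _ in range(n_steps):
            grad_x, grad_m = x, m  # gradient of H at (x, m)
            # (D + Q) grad H = (a*grad_x - c*grad_m, c*grad_x + b*grad_m)
            dx = -(a * grad_x - c * grad_m) * dt + noise_x * rng.gauss(0.0, 1.0)
            dm = -(c * grad_x + b * grad_m) * dt + noise_m * rng.gauss(0.0, 1.0)
            x, m = x + dx, m + dm
        samples.append((x, m))
    return samples


def moments(samples):
    """Return (mean_x, mean_m, var_x, var_m, cov_xm) of a list of (x, m) pairs."""
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_m = sum(s[1] for s in samples) / n
    var_x = sum((s[0] - mean_x) ** 2 for s in samples) / n
    var_m = sum((s[1] - mean_m) ** 2 for s in samples) / n
    cov_xm = sum((s[0] - mean_x) * (s[1] - mean_m) for s in samples) / n
    return mean_x, mean_m, var_x, var_m, cov_xm


if __name__ == "__main__":
    # Start every particle at the same "data" point and diffuse it forward.
    final = simulate_recipe_sde(a=0.25, b=1.0, c=1.0, z0=(3.0, -3.0))
    print(moments(final))  # means and covariance near 0, variances near 1
```

Changing `a`, `b`, or `c` changes how quickly the particles mix, but (up to discretization and sampling error) not where they end up, which is the property the recipe is meant to guarantee.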