Guided diffusion is a technique for conditioning the output of a diffusion model at sampling time without retraining the network for each specific task. One drawback of diffusion models, however, is their slow sampling process. Recent techniques accelerate unguided sampling by applying high-order numerical methods to the sampling process, viewed as a differential equation. We find, however, that the same techniques do not work for guided sampling, whose acceleration remains largely unexplored. This paper investigates the cause of this problem and provides a solution based on operator splitting methods, motivated by our key finding that classical high-order numerical methods are unsuitable for the conditional function. Our proposed method makes it possible to reuse the high-order methods for guided sampling and can generate images of the same quality as a 250-step DDIM baseline using 32-58% less sampling time on ImageNet256. We also demonstrate its use on a wide variety of conditional generation tasks, such as text-to-image generation, colorization, inpainting, and super-resolution.
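To make the idea of operator splitting concrete, the following is a minimal sketch on a toy scalar ODE, not the paper's algorithm. It assumes an equation of the form dx/dt = f(x) + g(x), where the hypothetical `base_term` stands in for the unconditional part and the hypothetical `guidance_term` stands in for the conditional function. A Lie-Trotter splitting advances each term separately within a step; here the first term is stepped with a classical fourth-order Runge-Kutta method and the second with forward Euler. The pairing of solvers, the function names, and the constants are all illustrative assumptions.

```python
import math

def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def euler_step(g, x, h):
    """One forward Euler step for dx/dt = g(x)."""
    return x + h * g(x)

def lie_trotter_solve(f, g, x0, t_end, n_steps):
    """Integrate dx/dt = f(x) + g(x) by Lie-Trotter splitting.

    Each step first advances the f sub-problem with RK4, then the
    g sub-problem with Euler. The overall scheme is first-order
    accurate in the step size, regardless of the sub-solvers' order.
    """
    h = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x = rk4_step(f, x, h)    # high-order solver on the first term
        x = euler_step(g, x, h)  # low-order solver on the second term
    return x

# Toy problem (an assumption for illustration, unrelated to any real model).
K = 2.0       # hypothetical guidance strength
TARGET = 1.0  # hypothetical value the guidance pulls toward

def base_term(x):
    return -x

def guidance_term(x):
    return K * (TARGET - x)

def exact_solution(x0, t):
    """Closed-form solution of dx/dt = -x + K * (TARGET - x)."""
    x_star = K * TARGET / (1.0 + K)
    return x_star + (x0 - x_star) * math.exp(-(1.0 + K) * t)

if __name__ == "__main__":
    for n in (100, 200):
        approx = lie_trotter_solve(base_term, guidance_term, 3.0, 1.0, n)
        err = abs(approx - exact_solution(3.0, 1.0))
        print(f"n_steps={n}: x(1) ~ {approx:.5f}, abs error {err:.2e}")
```

Doubling the number of steps roughly halves the error in this toy setting, which reflects the first-order accuracy of Lie-Trotter splitting. Higher-order splittings such as Strang splitting exist; whether and how they apply to guided sampling is outside what this sketch shows.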