Diffusion models have found widespread adoption in various areas. However, their sampling process is slow because it requires hundreds to thousands of network evaluations to emulate a continuous process defined by differential equations. In this work, we use neural operators, an efficient method for solving the probability flow differential equations, to accelerate the sampling process of diffusion models. In contrast to other fast sampling methods, which are sequential in nature, we are the first to propose a parallel decoding method that generates images with only one model forward pass. We propose \textit{diffusion model sampling with neural operator} (DSNO), which maps the initial condition, i.e., a Gaussian distribution, to the continuous-time solution trajectory of the reverse diffusion process. To model the temporal correlations along the trajectory, we introduce temporal convolution layers, parameterized in Fourier space, into the given diffusion model backbone. We show that our method achieves a state-of-the-art FID of 4.12 on CIFAR-10 and 8.35 on ImageNet-64 in the one-model-evaluation setting.
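To make the idea of a temporal convolution parameterized in Fourier space more concrete, the following is a minimal NumPy sketch in the style of a Fourier neural operator layer. It is not the authors' implementation: the function name `temporal_spectral_conv`, the tensor layout `(T, C, H, W)`, and the choice to keep only the lowest `modes` temporal frequencies are all assumptions made for illustration. The layer transforms features along the time axis of the trajectory, mixes channels with one complex weight matrix per retained frequency, and transforms back, so all time points are produced in one pass.

```python
import numpy as np


def temporal_spectral_conv(x, weights):
    """Convolve along the time axis by multiplying in Fourier space.

    x:       real array of shape (T, C_in, H, W), features at T time points
             along the trajectory (hypothetical layout).
    weights: complex array of shape (modes, C_in, C_out), one channel-mixing
             matrix per retained temporal frequency (hypothetical layout).
    Returns a real array of shape (T, C_out, H, W).
    """
    T = x.shape[0]
    modes, _, c_out = weights.shape
    # Real FFT along time: frequencies 0 .. T // 2.
    x_hat = np.fft.rfft(x, axis=0)
    assert modes <= x_hat.shape[0], "cannot keep more modes than the FFT provides"

    # Frequencies above `modes` are truncated (left at zero).
    out_hat = np.zeros((x_hat.shape[0], c_out) + x.shape[2:], dtype=complex)
    # Per-frequency channel mixing: (k, i, h, w) x (k, i, o) -> (k, o, h, w).
    out_hat[:modes] = np.einsum("kihw,kio->kohw", x_hat[:modes], weights)

    # Back to the time domain, restoring the original number of time points.
    return np.fft.irfft(out_hat, n=T, axis=0)


# Usage: 8 time points, 3 input channels, 4 output channels, 2 retained modes.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3, 16, 16))
w = rng.standard_normal((2, 3, 4)) + 1j * rng.standard_normal((2, 3, 4))
y = temporal_spectral_conv(x, w)  # shape (8, 4, 16, 16)
```

Truncating to a few low-frequency modes keeps the number of parameters independent of how many time points are used to discretize the trajectory, which is the usual motivation for this kind of parameterization in neural operators. In a real model the weights would be learned and the layer would be interleaved with the backbone's spatial blocks.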