Generative diffusion processes are an emerging and effective tool for image and speech generation. In existing methods, the underlying noise distribution of the diffusion process is Gaussian. However, fitting distributions with more degrees of freedom could improve the performance of such generative models. In this work, we investigate other types of noise distribution for the diffusion process. Specifically, we introduce the Denoising Diffusion Gamma Model (DDGM) and show that noise drawn from a Gamma distribution provides improved results for image and speech generation. Our approach preserves the ability to efficiently sample a state at an arbitrary step of the training diffusion process while using Gamma noise.
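To illustrate the efficient-sampling claim above, the minimal sketch below draws a state x_t directly from x_0 using zero-mean (mean-shifted) Gamma noise, exploiting the fact that Gamma variables with a common scale are closed under addition. The helper name `sample_xt_gamma`, the toy beta schedule, and the particular shape/scale schedule (theta_t = theta_0 * sqrt(alpha_bar_t), k_t * theta_t^2 = 1 - alpha_bar_t) are assumptions of this sketch chosen to match the variance of the usual Gaussian forward process, not necessarily the paper's exact parameterization.

```python
import numpy as np

def sample_xt_gamma(x0, t, alpha_bar, theta_0=0.001, rng=None):
    """Sample a noisy state x_t directly from x_0 with mean-shifted Gamma noise.

    Assumed schedule (a sketch, not the paper's exact setting):
      theta_t = theta_0 * sqrt(alpha_bar_t)
      k_t * theta_t**2 = 1 - alpha_bar_t   (matches the Gaussian case's variance)
    """
    rng = rng or np.random.default_rng()
    a_bar = alpha_bar[t]
    theta_t = theta_0 * np.sqrt(a_bar)        # scale shrinks with the signal level
    k_t = (1.0 - a_bar) / theta_t**2          # shape chosen so the noise variance is 1 - alpha_bar_t
    g = rng.gamma(shape=k_t, scale=theta_t, size=x0.shape)
    noise = g - k_t * theta_t                 # subtract the mean so the noise is zero-centred
    return np.sqrt(a_bar) * x0 + noise

# Usage with a hypothetical linear beta schedule and a random "image".
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)
x0 = np.random.randn(3, 32, 32)
xt = sample_xt_gamma(x0, t=500, alpha_bar=alpha_bar)
```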