Denoising Diffusion Probabilistic Models (DDPMs) have shown impressive generation quality, although their long sampling chain leads to high computational costs. In this paper, we observe that the long sampling chain also causes an error accumulation phenomenon, similar to the exposure bias problem in autoregressive text generation. Specifically, we note a discrepancy between training and testing: the former is conditioned on ground-truth samples, while the latter is conditioned on previously generated results. To alleviate this problem, we propose a very simple but effective training regularization, which consists in perturbing the ground-truth samples to simulate the inference-time prediction errors. We empirically show that the proposed input perturbation leads to a significant improvement in sample quality while reducing both training and inference time. For instance, on CelebA 64$\times$64, we achieve a new state-of-the-art FID score of 1.27 while saving 37.5% of the training time. The code is publicly available at https://github.com/forever208/DDPM-IP.
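To make the regularization concrete, the following is a minimal sketch of one training step with input perturbation, assuming a standard PyTorch $\epsilon$-prediction DDPM loop; the names `model`, `alphas_cumprod`, and the perturbation strength `gamma` are illustrative, and the exact value of `gamma` (a small constant on the order of 0.1) should be taken from the released code rather than from this sketch. The key point is that extra Gaussian noise perturbs the network *input* while the regression *target* remains the clean noise:

```python
import torch
import torch.nn.functional as F

def ddpm_ip_training_loss(model, x0, alphas_cumprod, gamma=0.1):
    """One DDPM training step with input perturbation (illustrative sketch).

    model:           eps-prediction network, called as model(x_t, t)
    x0:              batch of ground-truth images, shape (B, C, H, W)
    alphas_cumprod:  1-D tensor of cumulative alpha-bar values, length T
    gamma:           perturbation strength (assumed hyperparameter)
    """
    b = x0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=x0.device)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)

    eps = torch.randn_like(x0)        # regression target (unperturbed noise)
    xi = torch.randn_like(x0)         # extra noise simulating prediction error
    eps_perturbed = eps + gamma * xi  # perturbed noise fed to the network

    # The network sees a noisy sample built from the perturbed noise ...
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps_perturbed
    # ... but is still trained to predict the clean noise eps, so the
    # training inputs mimic the imperfect inputs seen during sampling.
    return F.mse_loss(model(x_t, t), eps)
```

Note that the sampling procedure is untouched: the perturbation only changes the training-time inputs, which is why the method adds no inference cost.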