Diffusion models have achieved remarkable success in generating high-quality images thanks to their novel training procedures applied to unprecedented amounts of data. However, training a diffusion model from scratch is computationally expensive. This highlights the need to investigate the possibility of training these models iteratively, reusing computation as the data distribution changes. In this study, we take the first step in this direction and evaluate the continual learning (CL) properties of diffusion models. We begin by benchmarking the most common CL methods applied to Denoising Diffusion Probabilistic Models (DDPMs), where we note the strong performance of experience replay with a reduced rehearsal coefficient. Furthermore, we provide insights into the dynamics of forgetting, which exhibit diverse behavior across diffusion timesteps. We also uncover certain pitfalls of using the bits-per-dimension metric for evaluating CL.
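To make the replay setup concrete, below is a minimal sketch of experience replay with a reduced rehearsal coefficient applied to a DDPM training step. The names (`model`, `buffer`, `ddpm_loss`) and the simplified noise schedule are illustrative assumptions, not the paper's implementation:

```python
import torch

def ddpm_loss(model, x0, num_timesteps=1000):
    """Simplified DDPM objective: predict the noise added at a random timestep.

    A linear cumulative-alpha schedule is used here purely for brevity;
    real DDPMs derive it from a beta schedule.
    """
    t = torch.randint(0, num_timesteps, (x0.shape[0],), device=x0.device)
    noise = torch.randn_like(x0)
    alpha_bar = torch.linspace(0.999, 0.01, num_timesteps, device=x0.device)[t]
    alpha_bar = alpha_bar.view(-1, *([1] * (x0.dim() - 1)))
    # Forward diffusion: produce the noised sample x_t from x_0.
    x_t = alpha_bar.sqrt() * x0 + (1 - alpha_bar).sqrt() * noise
    return torch.nn.functional.mse_loss(model(x_t, t), noise)

def train_step(model, optimizer, current_batch, buffer, rehearsal_coef=0.1):
    """One CL training step: current-task loss plus a down-weighted
    replay loss on samples from previous tasks (experience replay
    with a reduced rehearsal coefficient)."""
    optimizer.zero_grad()
    loss = ddpm_loss(model, current_batch)
    if len(buffer) > 0:
        # `buffer.sample` is an assumed replay-buffer interface returning
        # a batch of stored examples from earlier tasks.
        replay_batch = buffer.sample(current_batch.shape[0])
        loss = loss + rehearsal_coef * ddpm_loss(model, replay_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key design choice sketched here is that `rehearsal_coef < 1` down-weights the replay loss relative to the current-task loss, which is the reduced-coefficient variant the benchmark found to perform strongly.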