Recent advances in diffusion models have brought state-of-the-art performance on image generation tasks. However, empirical results from previous research on diffusion models imply an inverse correlation between density estimation performance and sample generation performance. This paper analyzes why this inverse correlation arises: density estimation is mostly contributed by small diffusion times, whereas sample generation mainly depends on large diffusion times. However, training a score network on both small and large diffusion times is demanding because of a loss imbalance issue. To successfully train the score network across both regimes, this paper introduces a training technique, Soft Truncation, which softens the truncation time at every mini-batch update and is universally applicable to any type of diffusion model. It turns out that Soft Truncation is equivalent to a diffusion model with a general weight, and we prove a variational bound for this generally weighted diffusion model. In view of this variational bound, Soft Truncation becomes a natural way to train the score network. In experiments, Soft Truncation achieves state-of-the-art performance on the CIFAR-10, CelebA, CelebA-HQ $256\times 256$, and STL-10 datasets.
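To make the mechanism concrete: in the standard notation of score-based diffusion models, the truncated, weighted denoising score matching objective reads
$$\mathcal{L}(\theta;\lambda,\tau)=\frac{1}{2}\,\mathbb{E}_{t\sim\mathcal{U}(\tau,T)}\Big[\lambda(t)\,\mathbb{E}_{\mathbf{x}_0,\mathbf{x}_t}\big\|\mathbf{s}_\theta(\mathbf{x}_t,t)-\nabla_{\mathbf{x}_t}\log p(\mathbf{x}_t\mid\mathbf{x}_0)\big\|_2^2\Big],$$
where a conventional model fixes the truncation $\tau$ to a small constant, while Soft Truncation redraws $\tau$ at every mini-batch update. The sketch below illustrates this in PyTorch; the log-uniform prior on $\tau$, the linear variance-preserving (VP) noise schedule, and the noise-prediction loss are assumptions made for the example, not the paper's exact configuration, and `score_net` is a hypothetical network.

```python
import math
import torch

T = 1.0             # terminal diffusion time
EPS_MIN = 1e-5      # smallest admissible truncation time
BETA_MIN, BETA_MAX = 0.1, 20.0  # assumed linear VP noise schedule

def vp_mean_std(t):
    """Mean coefficient and std of the VP perturbation kernel p(x_t | x_0)."""
    log_coeff = -0.25 * t ** 2 * (BETA_MAX - BETA_MIN) - 0.5 * t * BETA_MIN
    mean_coeff = torch.exp(log_coeff)
    std = torch.sqrt(1.0 - torch.exp(2.0 * log_coeff))
    return mean_coeff, std

def sample_times_soft_truncation(batch_size):
    """Soft Truncation: draw one truncation tau per mini-batch from an
    assumed log-uniform prior on [EPS_MIN, T], then draw diffusion times
    t ~ Uniform(tau, T). A hard-truncation baseline would instead use
    tau = EPS_MIN at every update."""
    tau = math.exp(
        torch.empty(1).uniform_(math.log(EPS_MIN), math.log(T)).item()
    )
    return torch.empty(batch_size).uniform_(tau, T)

def dsm_loss(score_net, x0):
    """Denoising score matching on a batch of flattened data x0: (B, D).
    `score_net(xt, t)` is a hypothetical noise-prediction network."""
    t = sample_times_soft_truncation(x0.shape[0])
    mean_coeff, std = vp_mean_std(t)
    noise = torch.randn_like(x0)
    xt = mean_coeff[:, None] * x0 + std[:, None] * noise
    # The score of p(x_t | x_0) is -noise / std, so predicting the noise
    # and penalizing (pred + noise)^2 keeps the loss well-scaled at small t.
    pred = score_net(xt, t)
    return ((pred + noise) ** 2).sum(dim=1).mean()
```

Intuitively, because $\tau$ varies across mini-batches, some updates exclude the smallest diffusion times entirely and focus on the large-time regime that drives sample generation, while other updates still reach the small-time regime that drives density estimation, which is how the per-batch softening addresses the loss imbalance described above.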