Recent studies show that diffusion-based models generate high-quality visual data while preserving better diversity. However, this observation has only been validated on curated data distributions, where samples are pre-processed to be uniformly distributed over their labels. In practice, long-tailed data distributions are more common, and how diffusion models perform on such class-imbalanced data remains unknown. In this work, we first investigate this problem and observe significant degradation in both diversity and fidelity when a diffusion model is trained on datasets with class-imbalanced distributions. The degradation is most pronounced in tail classes, where generations largely lose diversity and exhibit severe mode collapse. To tackle this problem, we start from the hypothesis that the data distribution is not class-balanced and propose Class-Balancing Diffusion Models (CBDM), which are trained with a distribution adjustment regularizer. Experiments show that images generated by CBDM exhibit higher diversity and quality, both quantitatively and qualitatively. We benchmark the generation results on the CIFAR100 and CIFAR100LT datasets, and the method shows outstanding performance on the downstream recognition task.
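The abstract does not say how the class-imbalanced data is constructed. As an illustration only, the sketch below builds a long-tailed label distribution using exponential decay of per-class sample counts, a convention often used for long-tailed CIFAR variants. The function name `long_tailed_counts`, the imbalance ratio of 0.01, and the decay rule are assumptions for this example, not details taken from the paper; the 500 images per class matches the standard CIFAR-100 training split.

```python
def long_tailed_counts(num_classes=100, max_per_class=500, imbalance_ratio=0.01):
    """Return per-class sample counts that decay exponentially from head to tail.

    Hypothetical helper: class 0 keeps max_per_class samples and the last
    class keeps roughly max_per_class * imbalance_ratio samples.
    """
    counts = []
    for k in range(num_classes):
        # Position of class k between head (0.0) and tail (1.0).
        frac = k / (num_classes - 1) if num_classes > 1 else 0.0
        # Exponential interpolation between the head and tail counts.
        counts.append(max(1, round(max_per_class * imbalance_ratio ** frac)))
    return counts


counts = long_tailed_counts()
print("head class:", counts[0], "tail class:", counts[-1])
```

Under these assumptions the head class has 100 times as many samples as the tail class, which is the kind of label skew the abstract refers to as class-imbalanced.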
Title: Class-Balancing Diffusion Models