Deep long-tailed learning aims to train useful deep networks on practical, real-world imbalanced distributions, in which most classes (the tail classes) have only a few samples each. A large body of work trains discriminative models for visual recognition on long-tailed distributions. In contrast, we aim to train conditional Generative Adversarial Networks, a class of image generation models, on long-tailed distributions. We find that, as in recognition, state-of-the-art methods for image generation also suffer from performance degradation on tail classes. This degradation is mainly due to class-specific mode collapse for tail classes, which we observe to be correlated with the spectral explosion of the conditioning parameter matrix. We propose a novel group Spectral Regularizer (gSR) that prevents the spectral explosion and thereby alleviates mode collapse, resulting in diverse and plausible image generation even for tail classes. We find that gSR combines effectively with existing augmentation and regularization techniques, leading to state-of-the-art image generation performance on long-tailed data. Extensive experiments demonstrate the efficacy of our regularizer on long-tailed datasets with different degrees of imbalance.
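As a rough illustration of what a group spectral penalty might look like, the sketch below reshapes each class's conditioning parameter vector into a matrix and sums the largest singular values across classes. This is a minimal sketch under stated assumptions, not the formulation used in this work: the function names, the `n_groups` parameter, the row-wise grouping, and the unweighted sum over classes are all hypothetical choices made for illustration.

```python
import numpy as np


def grouped_spectral_norm(param_vec, n_groups):
    """Largest singular value of a 1-D parameter vector after grouping.

    Assumption: the vector is reshaped row-wise into a matrix of shape
    (n_groups, len(param_vec) // n_groups). The grouping actually used
    by gSR is not specified in the abstract.
    """
    param_vec = np.asarray(param_vec, dtype=float)
    if param_vec.size % n_groups != 0:
        raise ValueError("vector length must be divisible by n_groups")
    grouped = param_vec.reshape(n_groups, -1)
    # For a 2-D array, ord=2 returns the spectral norm (largest singular value).
    return float(np.linalg.norm(grouped, ord=2))


def group_spectral_penalty(class_params, n_groups):
    """Sum of grouped spectral norms over classes (hypothetical penalty).

    class_params: array-like of shape (num_classes, dim), one conditioning
    parameter vector per class.
    """
    return sum(grouped_spectral_norm(row, n_groups) for row in class_params)
```

In a training loop, a term like this would presumably be scaled by a coefficient and added to the generator loss, so that any class whose grouped parameters develop a very large spectral norm is penalized.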