Real-world data often exhibits long-tailed distributions with heavy class imbalance, where the majority classes can dominate the training process and alter the decision boundaries of the minority classes. Recently, researchers have investigated the potential of supervised contrastive learning for long-tailed recognition, and demonstrated that it provides a strong performance gain. In this paper, we show that while supervised contrastive learning can help improve performance, past baselines suffer from poor uniformity brought in by the imbalanced data distribution. This poor uniformity manifests as poor separability of minority-class samples in the feature space. To address this problem, we propose targeted supervised contrastive learning (TSC), which improves the uniformity of the feature distribution on the hypersphere. TSC first generates a set of targets uniformly distributed on a hypersphere. It then makes the features of different classes converge to these distinct and uniformly distributed targets during training. This forces all classes, including minority classes, to maintain a uniform distribution in the feature space, improves class boundaries, and provides better generalization even in the presence of long-tailed data. Experiments on multiple datasets show that TSC achieves state-of-the-art performance on long-tailed recognition tasks.
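The target-generation step described above can be sketched as a simple repulsion optimization: initialize one target per class on the unit hypersphere, then repeatedly push the targets apart while re-projecting them onto the sphere. This is a minimal illustrative sketch, not the authors' implementation; the function name, the exponential repulsion objective, and all hyperparameters (`steps`, `lr`, `scale`) are assumptions for illustration.

```python
import numpy as np

def generate_uniform_targets(num_classes, dim, steps=500, lr=0.1, scale=2.0, seed=0):
    """Hypothetical sketch: spread `num_classes` unit vectors roughly
    uniformly on the `dim`-dimensional hypersphere by minimizing a
    pairwise exponential-similarity (repulsion) objective."""
    rng = np.random.default_rng(seed)
    t = rng.normal(size=(num_classes, dim))
    t /= np.linalg.norm(t, axis=1, keepdims=True)  # project to unit sphere
    for _ in range(steps):
        sim = t @ t.T                      # pairwise cosine similarities
        w = np.exp(scale * sim)            # nearby targets repel more strongly
        np.fill_diagonal(w, 0.0)           # a target does not repel itself
        grad = scale * (w @ t)             # gradient of sum_{i != j} exp(scale * t_i . t_j)
        t -= lr * grad                     # gradient step (repulsion)
        t /= np.linalg.norm(t, axis=1, keepdims=True)  # re-project to the sphere
    return t

targets = generate_uniform_targets(num_classes=4, dim=8)
```

During training, each class's features would then be pulled toward its assigned target, so even minority classes occupy a well-separated region of the feature space.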