Long-Tailed Semi-Supervised Learning (LTSSL) aims to learn from class-imbalanced data in which only a few samples are annotated. Existing solutions typically incur substantial cost to solve complex optimization problems, or rely on class-balanced undersampling, which can cause information loss. In this paper, we present TRAS (TRAnsfer and Share) to effectively utilize long-tailed semi-supervised data. TRAS transforms the imbalanced pseudo-label distribution of a traditional SSL model via a carefully designed function to enhance the supervisory signals for minority classes, and then transfers the transformed distribution to a target model so that minority classes receive sufficient attention. Interestingly, TRAS shows that a more balanced pseudo-label distribution can substantially benefit minority-class training, rather than pursuing accurate pseudo-labels as in previous works. To simplify the approach, TRAS merges the training of the traditional SSL model and the target model into a single procedure by sharing the feature extractor, where both classifiers help improve representation learning. Extensive experiments show that TRAS delivers much higher accuracy than state-of-the-art methods, on both the entire set of classes and the minority classes.
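To make the transfer-and-share idea concrete, the sketch below rebalances a teacher SSL model's pseudo-label distribution and distills it into a target classifier head that shares the feature extractor. The abstract does not specify the exact transform, so the logit-adjustment-style rebalancing (subtracting tau * log of the class prior), the temperature tau, and all names here (balanced_pseudo_targets, transfer_loss, SharedBackboneTwoHeads) are illustrative assumptions rather than the paper's actual implementation; this is a minimal PyTorch sketch of the general mechanism.

```python
import torch
import torch.nn.functional as F

def balanced_pseudo_targets(teacher_logits, class_prior, tau=2.0):
    """Rebalance the teacher's pseudo-label distribution.

    Assumption: a logit-adjustment-style transform. Subtracting
    tau * log(class_prior) boosts the probability mass assigned to
    minority classes before the distribution is transferred.
    """
    adjusted = teacher_logits - tau * torch.log(class_prior)
    return F.softmax(adjusted, dim=-1)

def transfer_loss(target_logits, teacher_logits, class_prior, tau=2.0):
    """Distill the rebalanced distribution into the target model
    via soft-target cross-entropy (teacher targets are detached)."""
    with torch.no_grad():
        targets = balanced_pseudo_targets(teacher_logits, class_prior, tau)
    log_probs = F.log_softmax(target_logits, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()

class SharedBackboneTwoHeads(torch.nn.Module):
    """Shared feature extractor with two classifier heads, merging the
    SSL-model and target-model training into a single procedure."""

    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone  # shared representation
        self.ssl_head = torch.nn.Linear(feat_dim, num_classes)
        self.target_head = torch.nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        feats = self.backbone(x)
        return self.ssl_head(feats), self.target_head(feats)
```

In this sketch, both heads backpropagate through the shared backbone, so the SSL head's standard training signal and the target head's rebalanced distillation signal jointly shape the learned representation.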