Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning

The capability of traditional semi-supervised learning (SSL) methods falls short of real-world applications because they do not consider (1) class imbalance and (2) class distribution mismatch between labeled and unlabeled data. This paper addresses this relatively under-explored problem, imbalanced semi-supervised learning, in which heavily biased pseudo-labels can harm model performance. Interestingly, we find that the semantic pseudo-labels from a similarity-based classifier in feature space and the traditional pseudo-labels from the linear classifier are complementary. Motivated by this observation, we propose a general pseudo-labeling framework to address the bias. The key idea is to class-adaptively blend the semantic pseudo-label with the linear one, depending on the current pseudo-label distribution. The increased semantic pseudo-label component thereby suppresses false positives in the majority classes, and vice versa. We term this novel pseudo-labeling framework for imbalanced SSL the Distribution-Aware Semantics-Oriented (DASO) Pseudo-label. Extensive evaluation on CIFAR10/100-LT and STL10-LT shows that DASO consistently outperforms recently proposed re-balancing methods for both labels and pseudo-labels. Moreover, we demonstrate that typical SSL algorithms can effectively benefit from unlabeled data with DASO, especially when (1) class imbalance and (2) class distribution mismatch exist, and even on the recent real-world Semi-Aves benchmark.
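
The class-adaptive blending described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of class prototypes with cosine similarity for the semantic pseudo-label, the temperature value, and the rule that normalizes the pseudo-label distribution by its maximum to obtain per-class weights are all assumptions made here for illustration. It only shows the general shape of the idea, namely that a class currently receiving many pseudo-labels gets a larger semantic component.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def semantic_pseudo_label(feature, prototypes, temperature=0.05):
    """Assumed similarity-based classifier: softmax over the cosine
    similarity between a feature and one prototype per class."""
    return softmax([cosine(feature, p) for p in prototypes], temperature)


def blend_weights(pseudo_label_dist):
    """Assumed rule: per-class weight of the semantic component, obtained by
    dividing the current pseudo-label distribution by its maximum."""
    top = max(pseudo_label_dist)
    return [m / top for m in pseudo_label_dist]


def blended_pseudo_label(linear_probs, semantic_probs, pseudo_label_dist):
    """Blend the two pseudo-labels with the weight of the class that the
    linear classifier predicts for this sample."""
    weights = blend_weights(pseudo_label_dist)
    predicted = max(range(len(linear_probs)), key=lambda k: linear_probs[k])
    w = weights[predicted]
    return [(1.0 - w) * pl + w * ps
            for pl, ps in zip(linear_probs, semantic_probs)]


# Hypothetical running distribution of pseudo-labels, biased toward class 0.
dist = [0.80, 0.15, 0.05]

# Sample predicted as the majority class: the semantic label dominates.
majority = blended_pseudo_label([0.7, 0.2, 0.1], [0.3, 0.6, 0.1], dist)

# Sample predicted as a minority class: the linear label is mostly kept.
minority = blended_pseudo_label([0.1, 0.2, 0.7], [0.3, 0.6, 0.1], dist)

# Semantic pseudo-label for a feature lying close to the second prototype.
semantic = semantic_pseudo_label([0.1, 0.9], [[1.0, 0.0], [0.0, 1.0]])
```

With these toy numbers, the sample predicted as class 0 has its pseudo-label replaced almost entirely by the semantic one, which is how a biased majority prediction can be corrected, while the sample predicted as class 2 stays close to the linear prediction.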
