Traditional semi-supervised learning (SSL) methods remain far from real-world applicability because their pseudo-labels become severely biased under (1) class imbalance and (2) class distribution mismatch between labeled and unlabeled data. This paper addresses this relatively under-explored problem. First, we propose a general pseudo-labeling framework that class-adaptively blends the semantic pseudo-label from a similarity-based classifier with the linear pseudo-label from a linear classifier, motivated by the observation that the two types of pseudo-labels have complementary properties in terms of bias. We further introduce a novel semantic alignment loss that establishes balanced feature representations and thereby reduces biased predictions from the classifier. We term the whole framework Distribution-Aware Semantics-Oriented (DASO) Pseudo-label. We conduct extensive experiments on a wide range of imbalanced benchmarks: CIFAR10/100-LT, STL10-LT, and the large-scale long-tailed Semi-Aves with open-set classes, and demonstrate that the proposed DASO framework reliably improves SSL learners with unlabeled data, especially when both (1) class imbalance and (2) distribution mismatch dominate.
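To make the blending idea concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes that the similarity-based classifier scores a feature by cosine similarity to one prototype per class, and that each class has a blending weight in [0, 1] selected by the linear classifier's predicted class. The names `semantic_pseudo_label`, `blend_pseudo_labels`, `prototypes`, `class_weights`, and `temperature` are hypothetical; the abstract does not specify how the weights or prototypes are obtained.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def semantic_pseudo_label(features, prototypes, temperature=0.05):
    """Similarity-based pseudo-label (assumed form).

    features:   (N, D) unlabeled feature vectors
    prototypes: (K, D) one prototype per class (hypothetical input)
    Returns (N, K) probabilities from temperature-scaled cosine similarity.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return softmax(f @ c.T / temperature, axis=1)


def blend_pseudo_labels(linear_probs, semantic_probs, class_weights):
    """Class-adaptive blend of linear and semantic pseudo-labels.

    linear_probs, semantic_probs: (N, K) probability vectors
    class_weights: (K,) values in [0, 1]; this is an assumption, with a
        larger weight meaning the semantic pseudo-label is trusted more
        for samples the linear classifier assigns to that class.
    """
    predicted = linear_probs.argmax(axis=1)      # class chosen by the linear head
    w = class_weights[predicted][:, None]        # per-sample weight, shape (N, 1)
    return (1.0 - w) * linear_probs + w * semantic_probs
```

Because each output row is a convex combination of two probability vectors, it remains a valid probability vector. A weight of 0 for a class leaves the linear pseudo-label unchanged for samples predicted as that class, and a weight of 1 replaces it with the semantic one.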