Pseudo labeling and consistency regularization approaches with confidence-based thresholding have made great progress in semi-supervised learning (SSL). In this paper, we theoretically and empirically analyze the relationship between the unlabeled data distribution and the desirable confidence threshold. Our analysis shows that previous methods might fail to define a favorable threshold since they either require a pre-defined/fixed threshold or an ad-hoc threshold-adjusting scheme that does not reflect the learning status well, resulting in inferior performance and slow convergence, especially for complicated unlabeled data distributions. We hence propose \emph{FreeMatch} to define and adjust the confidence threshold in a self-adaptive manner according to the model's learning status. To handle complicated unlabeled data distributions more effectively, we further propose a self-adaptive class fairness regularization method that encourages the model to produce diverse predictions during training. Extensive experimental results indicate the superiority of FreeMatch, especially when labeled data are extremely rare. FreeMatch achieves \textbf{5.78}\%, \textbf{13.59}\%, and \textbf{1.28}\% error rate reduction over the latest state-of-the-art method FlexMatch on CIFAR-10 with 1 label per class, STL-10 with 4 labels per class, and ImageNet with 100k labels, respectively.
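To make the self-adaptive thresholding idea concrete, the following is a minimal sketch, not the paper's exact formulation: it maintains the confidence threshold as an exponential moving average (EMA) of the model's mean top-1 probability on unlabeled batches, so the threshold rises as the model becomes more confident. All names (`update_threshold`, `pseudo_label_mask`, `momentum`) are illustrative assumptions.

```python
import numpy as np

def update_threshold(tau, probs, momentum=0.999):
    """EMA update of a global confidence threshold from an unlabeled
    batch's softmax outputs `probs` (shape: batch x classes).
    Sketch of self-adaptive thresholding; names are illustrative."""
    batch_conf = probs.max(axis=1).mean()  # mean top-1 probability
    return momentum * tau + (1.0 - momentum) * batch_conf

def pseudo_label_mask(tau, probs):
    """Select unlabeled samples whose top-1 probability clears the
    current threshold; only these receive pseudo labels."""
    return probs.max(axis=1) >= tau
```

Under this scheme the threshold starts low (admitting many pseudo labels early, which speeds convergence) and grows with the model's learning status, filtering out noisy pseudo labels later in training.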