Semi-supervised learning (SSL) has played an important role in leveraging unlabeled data when labeled data is limited. One of the most successful SSL approaches is based on consistency regularization, which encourages the model to produce unchanged outputs under input perturbations. However, less attention has been paid to inputs that share the same label. Motivated by the observation that inputs with the same label should yield similar model outputs, we propose a novel method, RankingMatch, that considers not only perturbed inputs but also the similarity among inputs having the same label. In particular, we introduce a new objective function, dubbed BatchMean Triplet loss, which is computationally efficient while still taking all input samples into account. RankingMatch achieves state-of-the-art performance across many standard SSL benchmarks with a variety of labeled-data amounts, including 95.13% accuracy on CIFAR-10 with 250 labels, 77.65% accuracy on CIFAR-100 with 10,000 labels, 97.76% accuracy on SVHN with 250 labels, and 97.77% accuracy on SVHN with 1000 labels. We also perform an ablation study to demonstrate the efficacy of the proposed BatchMean Triplet loss against existing versions of Triplet loss.
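To make the "efficient yet uses all samples" property concrete, below is a minimal NumPy sketch of a mean-based triplet loss in the spirit described above: for each anchor, the mean distance to all same-label (positive) samples is compared against the mean distance to all different-label (negative) samples. This is an illustrative reconstruction from the abstract's description, not the paper's reference implementation; the function name and margin value are assumptions.

```python
import numpy as np

def batch_mean_triplet_loss(embeddings, labels, margin=1.0):
    """Sketch of a mean-based triplet loss (hypothetical implementation).

    For each anchor, the MEAN distance to its positives is pushed below the
    MEAN distance to its negatives by `margin`. Every sample in the batch
    contributes (unlike batch-hard mining), yet only one B-by-B distance
    matrix is needed, which is the source of the efficiency claim.
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    # Pairwise Euclidean distances, shape (B, B); small eps keeps sqrt stable.
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)
    losses = []
    for i in range(n):
        pos = (labels == labels[i]) & (np.arange(n) != i)  # same label, not self
        neg = labels != labels[i]                          # different label
        if not pos.any() or not neg.any():
            continue  # an anchor needs at least one positive and one negative
        # Hinge on mean-positive vs. mean-negative distance for this anchor.
        losses.append(max(0.0, dist[i, pos].mean() - dist[i, neg].mean() + margin))
    return float(np.mean(losses)) if losses else 0.0
```

For well-separated classes the mean positive distance is far below the mean negative distance and the hinge zeroes out the loss; for collapsed or overlapping embeddings the loss approaches the margin, pushing same-label outputs together and different-label outputs apart.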