Semi-supervised learning (SSL) is an influential framework for exploiting unlabeled data when too little labeled data is available for training. SSL methods based on convolutional neural networks (CNNs) have recently achieved strong results on standard benchmark tasks such as image classification. In this work, we consider the general SSL setting in which the labeled and unlabeled data come from the same underlying probability distribution. We propose a new approach that uses optimal transport (OT) as a measure of similarity between discrete empirical probability measures to assign pseudo-labels to the unlabeled data; these pseudo-labels are then used together with the initial labeled data to train the CNN model in an SSL manner. We evaluate our method against state-of-the-art SSL algorithms on standard datasets to demonstrate its superiority and effectiveness.
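As a rough illustration of the general idea of OT-based pseudo-labeling, the sketch below computes an entropy-regularized transport plan between the empirical measure of the unlabeled points and that of the labeled points, then assigns each unlabeled point the class that receives most of its transported mass. This is not the procedure proposed in this work, whose OT formulation is not specified here; the Sinkhorn solver, the uniform weights, the squared Euclidean cost, and the names `sinkhorn_plan`, `ot_pseudo_labels`, `reg`, and `n_iters` are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT plan between histograms a (n,) and b (m,).

    Assumption: plain Sinkhorn iterations, chosen here for brevity.
    """
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # match column marginals
        u = a / (K @ v)                  # match row marginals
    return u[:, None] * K * v[None, :]   # plan of shape (n, m)

def ot_pseudo_labels(x_unlabeled, x_labeled, y_labeled, n_classes, reg=0.1):
    """Assign each unlabeled point the class receiving most of its mass.

    Inputs are feature vectors (for example CNN embeddings, an assumption).
    """
    # Pairwise squared Euclidean cost, scaled to [0, 1] for stability.
    diff = x_unlabeled[:, None, :] - x_labeled[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    cost = cost / cost.max()

    # Uniform weights on both discrete empirical measures.
    a = np.full(len(x_unlabeled), 1.0 / len(x_unlabeled))
    b = np.full(len(x_labeled), 1.0 / len(x_labeled))

    plan = sinkhorn_plan(a, b, cost, reg=reg)
    one_hot = np.eye(n_classes)[y_labeled]   # (m, n_classes)
    class_mass = plan @ one_hot              # mass sent to each class
    return class_mass.argmax(axis=1)

# Toy usage: two well-separated clusters with two labeled points each.
x_lab = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y_lab = np.array([0, 0, 1, 1])
x_unl = np.array([[0.2, 0.5], [4.8, 5.5], [0.1, 0.1], [5.2, 5.9]])
pseudo = ot_pseudo_labels(x_unl, x_lab, y_lab, n_classes=2)
```

In an SSL pipeline, pseudo-labels of this kind would be combined with the original labeled set to train the classifier. Because the plan is balanced, the class proportions of the pseudo-labels are pulled toward those of the labeled set, which is reasonable only under the same-distribution assumption stated above.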