While much recent work in semi-supervised learning (SSL) has achieved strong performance on single-label classification, an equally important yet underexplored problem is how to exploit unlabeled data in multi-label classification. To extend the success of SSL to the multi-label setting, we first analyze illustrative examples to build intuition about the extra challenges that multi-label classification poses. Based on this analysis, we then propose PercentMatch, a percentile-based threshold adjusting scheme that dynamically alters the score thresholds of positive and negative pseudo-labels for each class during training, together with dynamic unlabeled loss weights that further reduce noise from early-stage unlabeled predictions. Without loss of simplicity, we achieve strong performance on the Pascal VOC2007 and MS-COCO datasets compared to recent SSL methods.
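To make the idea of percentile-based thresholds concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes that, for each class, the positive and negative thresholds are taken as percentiles of that class's predicted scores on a batch of unlabeled samples, and that the unlabeled loss weight follows a linear ramp. The function names, the percentile values (`pos_q`, `neg_q`), and the ramp schedule are all hypothetical choices made for illustration.

```python
# Hypothetical sketch of per-class percentile thresholds for multi-label
# pseudo-labels. Percentile values and the linear ramp are assumptions,
# not settings taken from the paper.

def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) with linear interpolation."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (q / 100.0) * (len(ordered) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


def class_thresholds(scores, pos_q=90.0, neg_q=50.0):
    """Compute a (positive, negative) score threshold pair for each class.

    scores: list of rows, one per unlabeled sample; each row holds one
    sigmoid score per class.
    """
    num_classes = len(scores[0])
    thresholds = []
    for c in range(num_classes):
        column = [row[c] for row in scores]
        thresholds.append((percentile(column, pos_q), percentile(column, neg_q)))
    return thresholds


def pseudo_labels(scores, thresholds):
    """Assign 1 (positive), 0 (negative), or None (ignored) per sample and class."""
    labels = []
    for row in scores:
        out = []
        for score, (pos_t, neg_t) in zip(row, thresholds):
            if score >= pos_t:
                out.append(1)
            elif score <= neg_t:
                out.append(0)
            else:
                out.append(None)  # too uncertain; excluded from the unlabeled loss
        labels.append(out)
    return labels


def unlabeled_weight(step, ramp_steps, max_weight=1.0):
    """Linearly ramp the unlabeled loss weight from 0 up to max_weight."""
    return max_weight * min(1.0, step / float(ramp_steps))
```

Because the thresholds are percentiles rather than fixed score cutoffs, each class keeps a roughly constant fraction of confident positives and negatives even when its score distribution shifts during training. Scores falling between the two thresholds are left unlabeled in this sketch.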