Semi-supervised learning (SSL) has recently proven to be an effective paradigm for leveraging large amounts of unlabeled data while mitigating the reliance on large labeled datasets. Conventional methods focused on extracting a pseudo label from each individual unlabeled sample, and thus they mostly struggled to handle inaccurate or noisy pseudo labels, which degrade performance. In this paper, we address this limitation with a novel SSL framework for aggregating pseudo labels, called AggMatch, which refines initial pseudo labels by using different confident instances. Specifically, we introduce an aggregation module for the consistency regularization framework that aggregates the initial pseudo labels based on the similarity between instances. To enlarge the pool of aggregation candidates beyond the mini-batch, we present a class-balanced, confidence-aware queue built with the momentum model, which encourages more stable and consistent aggregation. We also propose a novel uncertainty-based confidence measure for the pseudo label that considers the consensus among multiple hypotheses drawn from different subsets of the queue. We conduct experiments to demonstrate the effectiveness of AggMatch over the latest methods on standard benchmarks and provide extensive analyses.
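The core idea of similarity-based pseudo-label aggregation can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes instances are represented by feature embeddings, that each instance carries an initial pseudo-label distribution and a scalar confidence (both hypothetical inputs here), and it refines each pseudo label as a similarity- and confidence-weighted average over the candidate pool (e.g. the mini-batch or queue).

```python
import numpy as np

def aggregate_pseudo_labels(features, pseudo_labels, confidences, tau=0.1):
    """Refine pseudo labels by similarity-weighted aggregation (illustrative sketch).

    features:      (N, D) feature embeddings of the candidate instances
    pseudo_labels: (N, C) initial per-instance class distributions (rows sum to 1)
    confidences:   (N,)   scalar confidence per instance
    tau:           temperature controlling how sharply similarity is weighted
    """
    # L2-normalize features so the dot product is cosine similarity
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T  # (N, N) pairwise cosine similarities

    # Weight each candidate by its similarity (softmax over tau) and confidence,
    # so confident, similar instances contribute most to the refinement
    w = np.exp(sim / tau) * confidences[None, :]
    w /= w.sum(axis=1, keepdims=True)

    # Each refined pseudo label is a convex combination of candidates' labels,
    # so the output rows remain valid probability distributions
    return w @ pseudo_labels
```

In this sketch the aggregation pool is just the rows passed in; extending the pool beyond the mini-batch, as with a class-balanced confidence-aware queue, amounts to concatenating queued features, labels, and confidences into the candidate set before aggregating.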