Partial label learning (PLL) is a typical weakly supervised learning paradigm, where each sample is associated with a set of candidate labels. The basic assumption of PLL is that the ground-truth label must reside in the candidate set. However, this assumption may not be satisfied due to the unprofessional judgment of annotators, limiting the practical applicability of PLL. In this paper, we relax this assumption and study a more general problem, noisy PLL, where the ground-truth label may not exist in the candidate set. To address this challenging problem, we propose a novel framework called "Iterative Refinement Network (IRNet)". It aims to purify noisy samples via two key modules, i.e., noisy sample detection and label correction. Ideally, noisy PLL reduces to traditional PLL once all noisy samples are corrected. To guarantee the performance of these modules, we start with warm-up training and exploit data augmentation to reduce prediction errors. Through theoretical analysis, we prove that IRNet is able to reduce the noise level of the dataset and eventually approximate the Bayes optimal classifier. Experimental results on multiple benchmark datasets demonstrate the effectiveness of our method: IRNet is superior to existing state-of-the-art approaches on noisy PLL.
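To make the detection-and-correction loop concrete, here is a minimal sketch of one purification pass in the spirit described above. All names (`model.predict_proba`, `augment`) and the confidence threshold are illustrative assumptions, not the paper's actual implementation; it only shows the two modules the abstract names.

```python
import numpy as np

def purification_pass(model, X, candidate_sets, augment, threshold=0.95):
    """One iteration of noisy sample detection + label correction.

    A hedged sketch: `model.predict_proba` is assumed to return an
    (n_samples, n_classes) array of class probabilities, and `augment`
    is any data-augmentation function on the inputs.
    """
    # Average predictions over the original and an augmented view to
    # reduce prediction errors, as the abstract suggests.
    probs = (model.predict_proba(X) + model.predict_proba(augment(X))) / 2
    preds = probs.argmax(axis=1)
    confidence = probs.max(axis=1)

    for i, candidates in enumerate(candidate_sets):
        # Noisy sample detection: the model is highly confident in a
        # label that lies outside the candidate set.
        if confidence[i] >= threshold and preds[i] not in candidates:
            # Label correction: move the predicted label into the
            # candidate set, so the (presumed) ground truth re-enters
            # it and the dataset's noise level decreases.
            candidates.add(preds[i])
    return candidate_sets
```

Repeating such passes between training epochs is what would, ideally, convert noisy PLL back into traditional PLL once every noisy sample has been corrected.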