Learning from noisy data is a challenging task that significantly degrades model performance. In this paper, we present TCL, a novel twin contrastive learning model that learns robust representations and handles noisy labels for classification. Specifically, we construct a Gaussian mixture model (GMM) over the representations by injecting the supervised model predictions into the GMM, which links the label-free latent variables of the GMM with the label-noisy annotations. TCL then detects examples with wrong labels as out-of-distribution examples using another two-component GMM, taking the data distribution into account. We further propose cross-supervision with an entropy regularization loss that bootstraps the true targets from model predictions to handle the noisy labels. As a result, TCL can learn discriminative representations aligned with the estimated labels through mixup and contrastive learning. Extensive experimental results on several standard benchmarks and real-world datasets demonstrate the superior performance of TCL. In particular, TCL achieves a 7.5\% improvement on CIFAR-10 with 90\% label noise, an extremely noisy scenario. The source code is available at \url{https://github.com/Hzzone/TCL}.
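To make the detection step concrete, below is a minimal sketch, not the authors' implementation, of the two-component GMM idea: fit a mixture over a per-example score and flag the component with the higher mean as mislabeled. The sketch assumes the per-example cross-entropy loss as the score, a common proxy in the noisy-label literature; TCL itself builds the GMM over learned representations and model predictions. The function name \texttt{detect\_noisy\_labels} and its parameters are hypothetical; only scikit-learn's \texttt{GaussianMixture} is a real API.

\begin{verbatim}
# Illustrative sketch only: two-component GMM over per-example losses,
# flagging the high-loss component as likely-mislabeled examples.
import numpy as np
from sklearn.mixture import GaussianMixture

def detect_noisy_labels(per_example_loss, threshold=0.5):
    """Return a boolean mask marking examples whose labels look wrong."""
    # Normalize losses to [0, 1] for a numerically stable GMM fit.
    loss = np.asarray(per_example_loss, dtype=np.float64).reshape(-1, 1)
    loss = (loss - loss.min()) / (loss.max() - loss.min() + 1e-8)

    # Two components: clean (low-loss) vs. noisy (high-loss) examples.
    gmm = GaussianMixture(n_components=2, max_iter=100, reg_covar=1e-4)
    gmm.fit(loss)

    # Posterior probability of the component with the larger mean,
    # interpreted as the probability that the label is wrong.
    noisy_component = int(np.argmax(gmm.means_.ravel()))
    p_noisy = gmm.predict_proba(loss)[:, noisy_component]
    return p_noisy > threshold
\end{verbatim}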