Deep neural networks can easily overfit even to noisy labels because of their high capacity, which degrades the generalization performance of the model. To overcome this issue, we propose a new post-training approach for learning from noisy labels (LNL) that can significantly improve the generalization performance of any model pre-trained on noisy label data. Rather than suppressing overfitting, we exploit the overfitting property of a trained model to identify mislabeled samples. Specifically, our post-training approach gradually removes samples with high influence on the decision boundary and refines the decision boundary to improve generalization performance. Our post-training approach is highly synergistic when combined with existing LNL methods. Experimental results on various real-world and synthetic benchmark datasets demonstrate the validity of our approach in diverse realistic scenarios.
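The remove-and-refine loop described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a regularized logistic regression in place of a deep network, and it uses a self-influence score (the per-sample gradient weighted by the inverse Hessian) as a stand-in for "influence on the decision boundary". The names `fit_logreg`, `self_influence`, `post_train`, and the parameters `rounds` and `drop_per_round` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, l2=1e-2, steps=500, lr=0.5):
    """Fit L2-regularized logistic regression by plain gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / len(y) + l2 * w
        w -= lr * grad
    return w

def self_influence(X, y, w, l2=1e-2):
    """Score g_i^T H^{-1} g_i, where g_i is the loss gradient of sample i."""
    p = sigmoid(X @ w)
    G = (p - y)[:, None] * X  # per-sample gradients, shape (n, d)
    H = (X * (p * (1 - p))[:, None]).T @ X / len(y) + l2 * np.eye(X.shape[1])
    return np.einsum("ij,ij->i", G @ np.linalg.inv(H), G)

def post_train(X, y, rounds=5, drop_per_round=10):
    """Assumed schedule: repeatedly drop the highest-scoring samples and refit."""
    keep = np.ones(len(y), dtype=bool)
    w = fit_logreg(X, y)  # stands in for the pre-trained model
    for _ in range(rounds):
        idx = np.flatnonzero(keep)
        scores = self_influence(X[idx], y[idx], w)
        drop = idx[np.argsort(scores)[-drop_per_round:]]
        keep[drop] = False
        w = fit_logreg(X[keep], y[keep])  # refine the decision boundary
    return w, keep

# Synthetic two-class data with 15% of the labels flipped (illustrative only).
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(-1.5, 1.0, (n // 2, 2)),
               rng.normal(1.5, 1.0, (n // 2, 2))])
X = np.hstack([X, np.ones((n, 1))])  # constant column acts as a bias term
y_clean = np.repeat([0.0, 1.0], n // 2)
y_noisy = y_clean.copy()
flip = rng.choice(n, size=30, replace=False)
y_noisy[flip] = 1.0 - y_noisy[flip]

w, keep = post_train(X, y_noisy)
```

In this sketch, `keep` marks the samples that survive post-training, so comparing `~keep` against `flip` gives a rough check of whether the removed samples are mostly the mislabeled ones. The fixed per-round removal count is an assumption; the abstract does not specify the removal schedule or a stopping criterion.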