Deep neural networks can easily overfit even noisy labels due to their high capacity, which degrades a model's generalization performance. To overcome this issue, we propose a new approach to learning from noisy labels (LNL) via post-training, which can significantly improve the generalization performance of any model pre-trained on noisy-label data. To this end, we instead exploit the overfitting property of a trained model to identify mislabeled samples. Specifically, our post-training approach gradually removes samples with high influence on the decision boundary and refines the decision boundary to improve generalization performance. Our post-training approach creates strong synergies when combined with existing LNL methods. Experimental results on various real-world and synthetic benchmark datasets demonstrate the validity of our approach in diverse realistic scenarios.
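To make the described procedure concrete, below is a minimal sketch of such a post-training loop, assuming a PyTorch classifier. The margin-based score used here is only a stand-in proxy for "influence on the decision boundary" (the abstract does not specify the actual influence measure), and all names (`margin_score`, `post_train`, `drop_frac`) are hypothetical.

```python
# Minimal sketch, NOT the paper's exact method: iteratively drop the samples
# closest to (or across) the decision boundary, then fine-tune on the rest.
import torch
import torch.nn.functional as F

def margin_score(model, x, y):
    """Proxy for influence on the decision boundary: samples whose
    labeled-class logit barely beats (or loses to) the runner-up class
    lie near the boundary and receive a high score."""
    with torch.no_grad():
        logits = model(x)
    true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    runner_up = logits.scatter(1, y.unsqueeze(1), float("-inf")).max(dim=1).values
    return -(true_logit - runner_up)  # high score = small or negative margin

def post_train(model, x, y, rounds=5, drop_frac=0.05, lr=1e-3, epochs=3):
    """Gradually remove the highest-scoring (suspected mislabeled) samples,
    then refine the decision boundary by fine-tuning on the remainder."""
    keep = torch.ones(len(y), dtype=torch.bool)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(rounds):
        scores = margin_score(model, x[keep], y[keep])
        n_drop = int(drop_frac * keep.sum())
        worst = scores.topk(n_drop).indices
        idx = keep.nonzero(as_tuple=True)[0]
        keep[idx[worst]] = False          # remove high-influence samples
        for _ in range(epochs):           # refine on the cleaned subset
            opt.zero_grad()
            loss = F.cross_entropy(model(x[keep]), y[keep])
            loss.backward()
            opt.step()
    return model

# Example usage with a toy linear classifier (hypothetical setup):
# model = torch.nn.Linear(32, 10)
# x, y = torch.randn(512, 32), torch.randint(0, 10, (512,))
# post_train(model, x, y)
```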