Deep neural networks can easily overfit even noisy labels due to their high capacity, which degrades the generalization performance of the model. To overcome this issue, we propose a new approach to learning from noisy labels (LNL) via post-training, which can significantly improve the generalization performance of any pre-trained model on noisy-label data. To this end, we instead exploit the overfitting property of a trained model to identify mislabeled samples. Specifically, our post-training approach gradually removes samples with high influence on the decision boundary and refines the decision boundary to improve generalization performance. Our post-training approach creates strong synergies when combined with existing LNL methods. Experimental results on various real-world and synthetic benchmark datasets demonstrate the validity of our approach in diverse realistic scenarios.
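The loop described above (score each sample's influence on the decision boundary, remove the most influential samples, and refit) can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: a plain logistic-regression model stands in for the pre-trained network, the per-sample loss is used as a simple stand-in for the paper's boundary-influence measure, and the names `fit_logistic` and `post_train` are invented for this sketch.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, epochs=200):
    # Gradient-descent logistic regression; a stand-in for any
    # pre-trained classifier (the post-training idea is model-agnostic).
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def post_train(X, y, rounds=3, drop_frac=0.1):
    # Hypothetical sketch of post-training: each round, drop the
    # samples the current model finds hardest (highest loss, used
    # here as a crude proxy for influence on the decision boundary),
    # then refit the model on the remaining samples.
    keep = np.arange(len(y))
    w = fit_logistic(X, y)
    for _ in range(rounds):
        p = 1.0 / (1.0 + np.exp(-X[keep] @ w))
        p = np.clip(p, 1e-9, 1 - 1e-9)
        loss = -(y[keep] * np.log(p) + (1 - y[keep]) * np.log(1 - p))
        n_drop = max(1, int(drop_frac * len(keep)))
        keep = keep[np.argsort(loss)[:-n_drop]]   # remove highest-loss samples
        w = fit_logistic(X[keep], y[keep])        # refine the boundary
    return w, keep
```

On toy data with a few flipped labels, the mislabeled points incur large losses under the fitted model, so they are pruned early and the refit boundary moves closer to the clean one.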