Deep neural networks have shown impressive performance in supervised learning, enabled by their ability to fit the provided training data. However, their performance depends heavily on the quality of that data and often degrades in the presence of noise. We propose a principled approach to tackling label noise that assigns importance weights to individual instances and class labels. Our method works by formulating a class of constrained optimization problems that yield simple closed-form updates for these importance weights. The proposed optimization problems are solved per mini-batch, which obviates the need to store and update weights over the full dataset. Our optimization framework also provides a theoretical perspective on existing label-smoothing heuristics for addressing label noise, such as label bootstrapping. We evaluate our method on several benchmark datasets and observe considerable performance gains in the presence of label noise.
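To make the per-mini-batch, closed-form weighting idea concrete, the sketch below shows one possible instance of such a constrained problem. This is an illustrative assumption, not the paper's actual formulation: the specific objective (a weighted batch loss plus a KL penalty toward the uniform distribution, constrained to the simplex), the temperature `tau`, and the `instance_weights` helper are all hypothetical choices that happen to admit a softmax-style closed-form solution.

```python
import torch
import torch.nn.functional as F

def instance_weights(losses: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Closed-form minimizer (illustrative, not the paper's method) of
        min_w  sum_i w_i * l_i + tau * KL(w || uniform)  s.t.  w in the simplex,
    solved over a single mini-batch: w_i ∝ exp(-l_i / tau). Higher-loss
    (potentially mislabeled) examples receive exponentially smaller weights."""
    # Detach so the weights act as fixed coefficients in the backward pass.
    return F.softmax(-losses.detach() / tau, dim=0)

# Hypothetical training step: the weights are recomputed from scratch on each
# batch, so no per-example state is stored across the full dataset.
model = torch.nn.Linear(10, 3)                       # toy model (assumption)
x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
losses = F.cross_entropy(model(x), y, reduction="none")
w = instance_weights(losses, tau=0.5)
loss = (w * losses).sum()
loss.backward()
```

Under this particular choice, the closed-form update never requires iterative optimization within the batch, and recomputing weights each step matches the abstract's claim that nothing needs to be stored or updated over the whole dataset.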