Label smoothing (LS) is an emerging learning paradigm that uses a positively weighted average of the hard training labels and uniformly distributed soft labels. It has been shown that LS serves as a regularizer for training with hard labels and therefore improves model generalization. It was later reported that LS even helps improve robustness when learning with noisy labels. However, we observe that the advantage of LS vanishes in the high label noise regime. Puzzled by this observation, we proceeded to discover that several learning-with-noisy-labels solutions proposed in the literature instead relate more closely to negative label smoothing (NLS), which is defined as using a negative weight to combine the hard and soft labels! We show that NLS differs substantially from LS in the model confidence it induces. To differentiate the two cases, we call LS positive label smoothing (PLS), and this paper unifies PLS and NLS into generalized label smoothing (GLS). We characterize the properties of GLS when learning with noisy labels. Among other established properties, we theoretically show that NLS is more beneficial when the label noise rates are high. We also provide extensive experimental results on multiple benchmarks to support our findings.
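The GLS target described above can be sketched as follows — a minimal illustration (the function name and signature are ours, not the paper's), where a positive smooth rate recovers PLS and a negative one gives NLS:

```python
import numpy as np

def gls_targets(hard_labels, num_classes, smooth_rate):
    """Generalized label smoothing: mix one-hot hard labels with the
    uniform distribution over classes.
    smooth_rate > 0 -> positive label smoothing (PLS);
    smooth_rate < 0 -> negative label smoothing (NLS);
    smooth_rate = 0 -> plain hard labels."""
    one_hot = np.eye(num_classes)[hard_labels]
    uniform = np.full_like(one_hot, 1.0 / num_classes)
    return (1.0 - smooth_rate) * one_hot + smooth_rate * uniform
```

With a negative rate, the target on the annotated class exceeds 1 and the off-class targets go below 0, which is what pushes the model toward higher confidence on the labeled class than standard PLS would.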