Label smoothing (LS) is an emerging learning paradigm that uses a positively weighted average of the hard training labels and uniformly distributed soft labels. It has been shown that LS serves as a regularizer for training with hard labels and therefore improves model generalization. Later, LS was reported to even help improve robustness when learning with noisy labels. However, we observe that the advantage of LS vanishes in high label noise regimes. Puzzled by this observation, we proceeded to discover that several learning-with-noisy-labels solutions proposed in the literature instead relate more closely to negative label smoothing (NLS), which is defined as combining the hard and soft labels with a negative weight! We show that NLS differs substantially from LS in the model confidence it induces. To differentiate the two cases, we refer to LS as positive label smoothing (PLS), and this paper unifies PLS and NLS into generalized label smoothing (GLS). We provide an understanding of the properties of GLS when learning with noisy labels. Among other established properties, we theoretically show that NLS is more beneficial when the label noise rates are high. We also provide experimental results to support our findings.
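The GLS target distribution described above can be sketched in a few lines. This is a minimal illustration based only on the abstract's definition (a weighted mix of one-hot labels and the uniform distribution, with smoothing rate `r` allowed to be negative); the function names and exact parameterization are assumptions, not the paper's implementation.

```python
import numpy as np

def gls_targets(labels, num_classes, r):
    """Generalized label smoothing targets: (1 - r) * one_hot + r * uniform.
    r > 0 recovers positive label smoothing (PLS); r = 0 recovers the
    plain hard labels; r < 0 gives negative label smoothing (NLS), which
    places weight above 1 on the observed class and negative weight on
    the remaining classes. (Sketch; parameterization assumed.)"""
    one_hot = np.eye(num_classes)[labels]
    uniform = np.full((len(labels), num_classes), 1.0 / num_classes)
    return (1.0 - r) * one_hot + r * uniform

def gls_cross_entropy(logits, labels, r):
    """Cross-entropy loss computed against the GLS soft targets."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    targets = gls_targets(labels, logits.shape[1], r)
    return -(targets * log_probs).sum(axis=1).mean()
```

For example, with three classes and `r = -0.3`, the target for an observed label 0 is `[1.2, -0.1, -0.1]`: the rows still sum to 1, but the negative off-class weights penalize, rather than encourage, confidence spread over the other classes.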