The presence of label noise often misleads the training of deep neural networks. Departing from the recent literature, which largely assumes that the label noise rate is determined solely by the true label class, we observe that errors in human-annotated labels are more likely to depend on the difficulty of the task, resulting in settings with instance-dependent label noise. We first provide evidence that heterogeneous, instance-dependent label noise effectively down-weights examples with higher noise rates in a non-uniform way and thus causes imbalances, rendering the strategy of directly applying methods designed for class-dependent label noise questionable. Building on the recent peer loss framework [24], we then propose and study the potential of a second-order approach that leverages the estimation of several covariance terms defined between the instance-dependent noise rates and the Bayes optimal label. We show that this set of second-order statistics successfully captures the induced imbalances. We further show that, with the help of the estimated second-order statistics, we can identify a new loss function under which the expected risk of a classifier trained with instance-dependent label noise is equivalent to that of a new problem with only class-dependent label noise. This equivalence allows us to apply existing solutions for the better-studied class-dependent setting. We provide an efficient procedure to estimate these second-order statistics without access to either ground-truth labels or prior knowledge of the noise rates. Experiments on CIFAR10 and CIFAR100 with synthetic instance-dependent label noise, and on Clothing1M with real-world human label noise, verify our approach. Our implementation is available at https://github.com/UCSC-REAL/CAL.
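To make the peer loss baseline [24] that our approach builds on concrete, the following is a minimal NumPy sketch (not the paper's implementation; function names and the use of cross-entropy as the base loss are illustrative assumptions). Peer loss subtracts from each example's loss the loss computed on a randomly paired prediction and an independently drawn noisy label, which penalizes a classifier that blindly fits the noisy labels:

```python
import numpy as np

def cross_entropy(logits, labels):
    """Per-example cross-entropy computed from raw logits."""
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def peer_loss(logits, noisy_labels, rng):
    """Peer loss: base loss minus the loss on randomly decoupled
    (prediction, label) pairs drawn from the same batch."""
    n = len(noisy_labels)
    peer_pred = rng.permutation(n)   # random peer predictions
    peer_label = rng.permutation(n)  # independently drawn peer labels
    return (cross_entropy(logits, noisy_labels)
            - cross_entropy(logits[peer_pred], noisy_labels[peer_label])).mean()
```

The subtracted peer term has the same marginal distributions of predictions and labels but breaks their pairing, so a classifier that simply memorizes the noisy labels gains no advantage from the second term.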