Label noise and class imbalance are two major issues coexisting in real-world datasets. To alleviate the two issues, state-of-the-art methods reweight each instance by leveraging a small amount of clean and unbiased data. Yet, these methods overlook class-level information within each instance, which can be further utilized to improve performance. To this end, in this paper, we propose Generalized Data Weighting (GDW) to simultaneously mitigate label noise and class imbalance by manipulating gradients at the class level. To be specific, GDW unrolls the loss gradient to class-level gradients by the chain rule and reweights the flow of each gradient separately. In this way, GDW achieves remarkable performance improvement on both issues. Aside from the performance gain, GDW efficiently obtains class-level weights without introducing any extra computational cost compared with instance weighting methods. Specifically, GDW performs a gradient descent step on class-level weights, which only relies on intermediate gradients. Extensive experiments in various settings verify the effectiveness of GDW. For example, GDW outperforms state-of-the-art methods by $2.56\%$ under the $60\%$ uniform noise setting on CIFAR10. Our code is available at https://github.com/GGchen1997/GDW-NIPS2021.
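To make the core idea concrete, here is a minimal NumPy sketch of class-level gradient reweighting, not the authors' actual implementation (see the linked repository for that). It assumes a softmax cross-entropy loss, whose gradient with respect to the logits is $p - y$; each of the $C$ class-level components of this gradient is then scaled by its own weight, illustrating how gradient flows can be manipulated per class rather than per instance. The function and variable names (`class_weighted_grad`, `class_weights`) are illustrative, and the per-class weights are taken as given rather than learned by a gradient descent step as in GDW.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def class_weighted_grad(logits, labels, class_weights):
    """Illustrative class-level reweighting of the cross-entropy gradient.

    The standard gradient of softmax cross-entropy w.r.t. the logits is
    (p - y); instead of scaling it with one instance-level weight, each of
    its C class-level components is scaled separately.
    """
    n, c = logits.shape
    p = softmax(logits)
    y = np.eye(c)[labels]           # one-hot targets
    grad = p - y                    # per-class gradient flows, shape (n, c)
    return grad * class_weights     # reweight each class's flow separately

# Toy example: 2 instances, 3 classes, hypothetical per-class weights.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.2,  0.3]])
labels = np.array([0, 1])
w = np.array([1.0, 0.5, 2.0])       # assumed given, not learned here
g = class_weighted_grad(logits, labels, w)
```

With all weights set to 1 this reduces to the ordinary cross-entropy gradient, whose rows sum to zero; non-uniform weights break that symmetry, which is exactly the extra degree of freedom class-level weighting exploits.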