Training sample re-weighting is an effective approach for tackling data biases such as imbalanced and corrupted labels. Recent methods develop learning-based algorithms that learn sample re-weighting strategies jointly with model training, based on reinforcement learning and meta-learning frameworks. However, their dependence on additional unbiased reward data limits their general applicability. Furthermore, existing learning-based sample re-weighting methods require nested optimization of model and weighting parameters, which involves expensive second-order computation. This paper addresses both problems and presents a novel learning-based fast sample re-weighting (FSR) method that does not require additional reward data. The method rests on two key ideas: learning from history to build proxy reward data, and feature sharing to reduce the optimization cost. Our experiments show that the proposed method achieves results competitive with the state of the art on label-noise robustness and long-tailed recognition, while significantly improving training efficiency. The source code is publicly available at https://github.com/google-research/google-research/tree/master/ieg.
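To make the two ideas in the abstract concrete, the sketch below illustrates one learning-based re-weighting step in the style of meta-learning re-weighting, combined with a feature-sharing shortcut: only a linear head on top of shared backbone features is differentiated twice, which keeps the second-order cost small. This is a minimal illustration, not the paper's released TensorFlow implementation; the function name `fast_reweight_step`, the frozen-feature simplification, and the way the proxy batch is supplied are assumptions made here for clarity.

```python
import torch
import torch.nn.functional as F

def fast_reweight_step(feats, y, proxy_feats, proxy_y, head, lr=0.1):
    """One meta re-weighting step over a linear head on shared features.

    feats / proxy_feats: pre-computed backbone features, shape (B, D)
        (illustrative stand-in for the paper's feature-sharing idea).
    head: linear classifier weights, shape (C, D), requires_grad=True.
    Returns one normalized weight per example in the noisy batch.
    """
    eps = torch.zeros(feats.size(0), requires_grad=True)
    losses = F.cross_entropy(feats @ head.t(), y, reduction='none')

    # Virtual SGD step on the head under per-example weights eps.
    grad_head = torch.autograd.grad((eps * losses).sum(), head,
                                    create_graph=True)[0]
    head_virtual = head - lr * grad_head

    # Evaluate the virtually updated head on the proxy reward batch.
    proxy_loss = F.cross_entropy(proxy_feats @ head_virtual.t(), proxy_y)

    # Keep examples whose up-weighting would lower the proxy loss.
    w = torch.clamp(-torch.autograd.grad(proxy_loss, eps)[0], min=0.0)
    return w / (w.sum() + 1e-8)
```

In this sketch the proxy batch plays the role of FSR's "learning from history": rather than a held-out unbiased reward set, it would be populated with training examples the model has handled reliably over past iterations (one plausible criterion is a consistently low loss), though the exact selection rule here is an assumption rather than the paper's procedure.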