This paper aims to provide an understanding of the effect of an over-parameterized model, e.g., a deep neural network, memorizing instance-dependent noisy labels. We first quantify the harm caused by memorizing noisy instances and show the disparate impact of noisy labels on instances with different representation frequencies. We then analyze how several popular solutions for learning with noisy labels mitigate this harm at the instance level. Our analysis reveals that existing approaches treat noisy instances disparately: while higher-frequency instances often enjoy a high probability of improvement when these solutions are applied, lower-frequency instances do not. Our analysis offers new insight into when these approaches work and provides theoretical justifications for previously reported empirical observations. These findings call for rethinking the distribution of label noise across instances and for different treatments of instances in different regimes.