Given data with label noise (i.e., incorrectly labeled data), deep neural networks gradually memorize the noisy labels, which impairs model performance. To alleviate this issue, curriculum learning has been proposed to improve model performance and generalization by ordering training samples in a meaningful (e.g., easy-to-hard) sequence. Previous work treats incorrect samples as generic hard ones without discriminating between hard samples (i.e., hard samples in correct data) and incorrect samples. Indeed, a model should learn from hard samples to promote generalization rather than overfit to incorrect ones. In this paper, we address this problem by appending a novel loss function, DiscrimLoss, on top of the existing task loss. Its main effect is to automatically and stably estimate the importance of easy samples and difficult samples (the latter including both hard and incorrect samples) in the early stages of training to improve model performance. In the following stages, DiscrimLoss is dedicated to discriminating between hard and incorrect samples to improve model generalization. Such a training strategy can be formulated dynamically in a self-supervised manner, effectively mimicking the main principle of curriculum learning. Experiments on image classification, image regression, text sequence regression, and event relation reasoning demonstrate the versatility and effectiveness of our method, particularly in the presence of diversified noise levels.
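To make the mechanism concrete, the PyTorch sketch below illustrates one way a per-sample confidence weight can be learned jointly with the task loss, in the spirit of the description above. This is an illustrative assumption, not the paper's actual DiscrimLoss formulation: the learnable log-weight per sample (`log_w`), the running-average threshold `tau` separating easy from difficult samples, and the quadratic log-weight regularizer are all our own hypothetical design choices.

```python
# Minimal sketch of confidence-weighted training in the spirit of DiscrimLoss.
# NOT the authors' exact loss: `log_w`, `tau`, and the regularizer are
# assumptions chosen to mimic the described behavior (up-weight easy samples,
# down-weight likely-incorrect ones, keep learning from hard-but-correct ones).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfidenceWeightedLoss(nn.Module):
    def __init__(self, num_samples: int, lam: float = 1.0, momentum: float = 0.9):
        super().__init__()
        # One learnable log-weight per training sample (hypothetical design).
        self.log_w = nn.Parameter(torch.zeros(num_samples))
        # Running average of the batch loss, used as the easy/difficult threshold.
        self.register_buffer("tau", torch.zeros(()))
        self.lam = lam
        self.momentum = momentum

    def forward(self, logits, targets, sample_idx):
        per_sample = F.cross_entropy(logits, targets, reduction="none")
        with torch.no_grad():  # update the threshold without tracking gradients
            self.tau.mul_(self.momentum).add_((1 - self.momentum) * per_sample.mean())
        w = self.log_w[sample_idx].exp()
        # Easy samples (loss < tau) drive their weight above 1, difficult samples
        # (loss > tau) below 1; the regularizer anchors weights near 1 so that
        # hard-but-correct samples are not discarded outright.
        return (w * (per_sample - self.tau)
                + self.lam * self.log_w[sample_idx] ** 2).mean()
```

In use, the module's parameters would be added to the optimizer alongside the model's, and the training dataset would need to return each sample's index so that `sample_idx` can select the corresponding weights. As training progresses, hard-but-correct samples tend to fall below the threshold and regain weight, while incorrect samples stay above it and remain down-weighted, which is one plausible way to realize the hard/incorrect discrimination the abstract describes.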