Annotating a dataset with high-quality labels is crucial for the performance of deep networks, but in real-world scenarios, labels are often contaminated by noise. To address this, several methods have been proposed that automatically separate clean from noisy labels and train a semi-supervised learner within a Learning with Noisy Labels (LNL) framework. However, they rely on a handcrafted module for clean-noisy label splitting, which induces confirmation bias in the semi-supervised learning phase and limits performance. In this paper, we present, for the first time, a learnable module for clean-noisy label splitting, dubbed SplitNet, and a novel LNL framework that trains SplitNet and the main network in a complementary manner. We propose a dynamic threshold based on SplitNet's split confidence to better optimize the semi-supervised learner. To enhance SplitNet training, we also present a risk hedging method. Our method achieves state-of-the-art performance on various LNL benchmarks, especially in high noise ratio settings.
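The dynamic-threshold idea can be illustrated with a minimal sketch. This is not the paper's implementation; the interpolation range (`base_tau`, `max_tau`) and the assumption that SplitNet's split confidence lies in [0, 1] are hypothetical choices made for illustration: samples that SplitNet deems confidently split face a stricter pseudo-label threshold, while uncertain samples face a looser one (or vice versa, depending on the design).

```python
import numpy as np

def dynamic_threshold(split_conf, base_tau=0.5, max_tau=0.95):
    """Interpolate a per-sample threshold between base_tau and max_tau
    according to an assumed split confidence in [0, 1]."""
    split_conf = np.clip(split_conf, 0.0, 1.0)
    return base_tau + (max_tau - base_tau) * split_conf

def select_pseudo_labels(probs, split_conf):
    """Keep a sample's argmax pseudo-label only when its maximum class
    probability exceeds its confidence-dependent threshold."""
    tau = dynamic_threshold(split_conf)          # per-sample thresholds
    max_prob = probs.max(axis=1)                 # classifier confidence
    mask = max_prob >= tau                       # which samples to keep
    pseudo = probs.argmax(axis=1)                # hard pseudo-labels
    return pseudo, mask
```

For example, with `split_conf = 0.0` the threshold is 0.5, so a prediction with max probability 0.6 is kept; with `split_conf = 1.0` the threshold rises to 0.95 and the same prediction is discarded.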