Deep neural networks have been shown to be very powerful methods for many supervised learning tasks. However, they can also easily overfit to training set biases, i.e., label noise and class imbalance. While both learning with noisy labels and class-imbalanced learning have received tremendous attention, existing works mainly focus on only one of these two training set biases. To fill the gap, we propose \textit{Prototypical Classifier}, which does not require fitting additional parameters given the embedding network. Unlike conventional classifiers that are biased towards head classes, Prototypical Classifier produces balanced and comparable predictions for all classes even though the training set is class-imbalanced. By leveraging this appealing property, we can easily detect noisy labels by thresholding the confidence scores predicted by Prototypical Classifier, where the threshold is dynamically adjusted over training iterations. A sample reweighting strategy is then applied to mitigate the influence of noisy labels. We test our method on the CIFAR-10-LT, CIFAR-100-LT and Webvision datasets, observing that Prototypical Classifier obtains substantial improvements over state-of-the-art methods.
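The core idea above — parameter-free class prototypes, balanced confidence scores, and threshold-based noisy-label detection — can be illustrated with a minimal sketch. This is not the paper's exact implementation; the prototype construction (mean of normalized embeddings), the cosine-similarity softmax, and the function names are all illustrative assumptions:

```python
import numpy as np

def class_prototypes(embeddings, labels, num_classes):
    # Prototype of each class = mean embedding of its samples; no extra
    # parameters are fit beyond the embedding network's outputs.
    protos = np.stack([embeddings[labels == c].mean(axis=0)
                       for c in range(num_classes)])
    # L2-normalize so similarities are comparable across classes,
    # regardless of how many samples each class has.
    return protos / np.linalg.norm(protos, axis=1, keepdims=True)

def prototype_confidence(embeddings, prototypes):
    # Softmax over cosine similarity to each prototype (an assumed
    # scoring rule for this sketch).
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    logits = x @ prototypes.T
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def flag_clean(confidence, labels, tau):
    # A sample is treated as clean if the confidence assigned to its
    # given label exceeds the threshold tau (dynamically adjusted
    # during training in the actual method; fixed here for brevity).
    return confidence[np.arange(len(labels)), labels] > tau
```

Downweighting or discarding the samples for which `flag_clean` returns `False` then gives the sample reweighting step described in the abstract.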