Deep learning has outperformed other machine learning algorithms in a variety of tasks and, as a result, has become increasingly popular and widely used. However, like other machine learning algorithms, deep learning, and convolutional neural networks (CNNs) in particular, performs worse when the data sets contain label noise. Therefore, it is important to develop algorithms that help deep networks train under label noise and generalize to noise-free test sets. In this paper, we propose a training strategy that is robust to label noise, called RAFNI, that can be used with any CNN. The algorithm filters and relabels instances of the training set based on the predictions made by the backbone neural network during training and the probabilities associated with them. In this way, it improves the generalization ability of the CNN on its own. RAFNI consists of three mechanisms: two that filter instances and one that relabels instances. In addition, it does not assume that the noise rate is known, nor does it need to estimate it. We evaluated our algorithm on data sets of different sizes and characteristics. We also compared it with state-of-the-art models on the CIFAR10 and CIFAR100 benchmarks under different types and rates of label noise, and found that RAFNI achieves better results in most cases.
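To make the idea of filtering and relabeling from network predictions concrete, the following is a minimal sketch and not the RAFNI mechanisms themselves, which are not specified in this abstract. It assumes a matrix of per-class probabilities produced by the backbone network and uses two hypothetical fixed thresholds, `relabel_threshold` and `filter_threshold`; the function name `filter_and_relabel` and both threshold values are illustrative assumptions.

```python
import numpy as np


def filter_and_relabel(probs, labels, relabel_threshold=0.9, filter_threshold=0.1):
    """Illustrative only: hypothetical thresholds, not the rules used by RAFNI.

    probs  : (n_samples, n_classes) predicted class probabilities
    labels : (n_samples,) possibly noisy integer labels
    Returns the updated labels and a boolean mask of instances to keep.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    predicted = probs.argmax(axis=1)   # class the network currently prefers
    confidence = probs.max(axis=1)     # probability of that class
    label_prob = probs[np.arange(len(labels)), labels]  # probability of the given label

    # Relabel: the network confidently disagrees with the given label.
    relabel = (predicted != labels) & (confidence >= relabel_threshold)
    new_labels = np.where(relabel, predicted, labels)

    # Filter: the given label is very unlikely, but no class is confident
    # enough to justify relabeling, so the instance is dropped.
    drop = ~relabel & (label_prob < filter_threshold)
    return new_labels, ~drop


probs = [[0.95, 0.05, 0.00],
         [0.50, 0.45, 0.05],
         [0.30, 0.65, 0.05],
         [0.10, 0.20, 0.70]]
labels = [1, 2, 1, 2]
new_labels, keep = filter_and_relabel(probs, labels)
```

In this toy input, the first instance is relabeled, the second is filtered out, and the last two are left unchanged. A training loop would apply such a step between epochs; fixed thresholds are used here only for brevity.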