Training deep neural networks (DNNs) with noisy labels is a challenging problem due to over-parameterization. DNNs tend to fit clean samples first, at a higher rate during the initial stages of training, and only later fit the noisy samples at a relatively lower rate. Consequently, on a noisy dataset the test accuracy increases initially and drops in the later stages. To find an early stopping point at the maximum obtainable test accuracy (MOTA), recent studies assume that either (i) a clean validation set is available, (ii) the noise ratio is known, or both. However, a clean validation set is often unavailable, and noise estimation can be inaccurate. To overcome these issues, we provide a novel training solution that requires neither condition. We analyze the rate of change of the training accuracy for different noise ratios under different conditions to identify a training stop region. We further develop a heuristic algorithm based on a small-learning assumption to find a training stop point (TSP) at or close to the MOTA. To the best of our knowledge, our method is the first to rely solely on the \textit{training behavior}, while utilizing the entire training set, to automatically find a TSP. We validate the robustness of our algorithm (AutoTSP) through several experiments on CIFAR-10, CIFAR-100, and a real-world noisy dataset for different noise ratios, noise types, and architectures.
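To make the idea of detecting a stop point from the training accuracy curve concrete, the sketch below is a minimal illustrative heuristic, not the AutoTSP algorithm itself: it flags an epoch once the smoothed rate of change of training accuracy stays below a small threshold, under the assumption that slow late-stage gains correspond to memorizing noisy labels. The function name, `window`, and `rate_threshold` are hypothetical parameters introduced only for illustration.

\begin{verbatim}
import numpy as np

def find_training_stop_point(train_acc, window=5, rate_threshold=1e-3):
    """Illustrative heuristic (not AutoTSP): return a candidate training
    stop point (TSP) once the per-epoch rate of change of training
    accuracy stays below `rate_threshold` for `window` consecutive epochs.
    """
    train_acc = np.asarray(train_acc, dtype=float)
    # Per-epoch rate of change of the training accuracy.
    rates = np.diff(train_acc)
    for epoch in range(window, len(rates) + 1):
        recent = rates[epoch - window:epoch]
        # Small, stable improvements suggest the clean samples are mostly
        # fitted; further training would start fitting noisy labels.
        if np.all(np.abs(recent) < rate_threshold):
            return epoch  # candidate TSP (epoch index)
    return len(train_acc) - 1  # fall back to the last epoch


# Usage example with a synthetic accuracy curve.
acc = [0.30, 0.50, 0.65, 0.72, 0.76, 0.78, 0.785, 0.786, 0.786, 0.787]
print("candidate TSP at epoch:", find_training_stop_point(acc, window=3))
\end{verbatim}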