Noisy labels are inevitable yet problematic in the machine learning community. They degrade a classifier's generalization by causing it to overfit to wrong labels. Existing methods for learning with noisy labels have focused on modifying the classifier's training procedure, which raises two problems. First, these methods cannot be applied to a pre-trained classifier without further access to its training. Second, it is difficult to train a classifier and remove all negative effects of noisy labels at the same time. To address these problems, we propose a new branch of approach to learning with noisy labels, Noisy Prediction Calibration (NPC). By introducing a new type of transition matrix and estimating it with a generative model, NPC corrects the noisy prediction of the pre-trained classifier toward the true label as a post-processing scheme. We prove that NPC aligns theoretically with transition-matrix-based methods. Yet NPC provides a more accurate pathway to estimating the true label, even without involvement in classifier learning. NPC is also applicable to any classifier trained with noisy-label methods, provided that the training instances and their predictions are available. NPC boosts the classification performance of all baseline models on both synthetic and real-world datasets.
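The post-processing idea can be illustrated with a minimal sketch. The abstract does not specify the form of the transition matrix or how the generative model estimates it, so the sketch below assumes a fixed, hand-written matrix whose row k holds p(true label | noisy prediction = k); the function name `calibrate` and all numbers are hypothetical and are not taken from the paper. It only shows how a noisy prediction could be mapped to a calibrated one without retraining the classifier.

```python
# Minimal sketch of post-hoc prediction calibration with a transition matrix.
# Assumption: transition[k][j] = p(true label = j | noisy prediction = k).
# In NPC this matrix is estimated with a generative model; here it is
# hand-written purely for illustration.

def calibrate(noisy_probs, transition):
    """Return p(y = j | x) = sum_k p(y_hat = k | x) * transition[k][j]."""
    num_classes = len(transition[0])
    return [
        sum(noisy_probs[k] * transition[k][j] for k in range(len(noisy_probs)))
        for j in range(num_classes)
    ]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical softmax output of a pre-trained classifier for one instance.
noisy_probs = [0.6, 0.3, 0.1]

# Hypothetical transition matrix: when the classifier predicts class 0,
# the true label is class 1 half of the time.
transition = [
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

calibrated = calibrate(noisy_probs, transition)
noisy_label = argmax(noisy_probs)      # label before calibration
calibrated_label = argmax(calibrated)  # label after calibration
```

In this toy case the classifier's top prediction is class 0, but after calibration most of the probability mass moves to class 1. Each row of the matrix sums to one, so the calibrated output remains a probability distribution.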