Incorrect labels in training data occur when human annotators make mistakes or when the data is generated via weak or distant supervision. It has been argued that complex noise-handling techniques, which model, clean, or filter the noisy instances, are required to prevent models from fitting this label noise. However, we show in this work that, for text classification tasks with modern NLP models like BERT, and across a variety of noise types, existing noise-handling methods do not always improve model performance, and may even deteriorate it, suggesting the need for further investigation. We back our observations with a comprehensive analysis.
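To make the setting concrete, one common way to study label noise is to inject it synthetically into a clean dataset. The sketch below shows uniform (class-independent) noise, where each label is flipped to a different class with a fixed probability; the function name and interface are illustrative assumptions, not the paper's implementation:

```python
import random

def inject_uniform_label_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label to a uniformly chosen wrong class with probability noise_rate."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Replace the true class with one of the other classes, chosen uniformly.
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return noisy
```

A model is then trained on the noisy labels while evaluation uses the original clean test set, which is how the effect of noise-handling methods is typically measured.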