Practical natural language processing (NLP) tasks commonly exhibit long-tailed distributions and noisy labels. These problems challenge the generalization and robustness of complex models such as Deep Neural Networks (DNNs). Commonly used resampling techniques, such as oversampling or undersampling, can easily lead to overfitting. Learning data weights with the help of a small amount of metadata has become increasingly popular. In addition, recent studies have shown the benefits of self-supervised pre-training, particularly for under-represented data. In this work, we propose a general framework that handles both long-tailed distributions and noisy labels. The model is adapted to the problem domain in a contrastive learning manner. The re-weighting module is a feed-forward network that learns explicit weighting functions and adapts the weights according to the metadata. The framework further adapts the weights of the terms in the loss function through a combination of the polynomial expansion of the cross-entropy loss and the focal loss. Our extensive experiments show that the proposed framework consistently outperforms baseline methods. Finally, our sensitivity analysis highlights the capability of the proposed framework to handle the long-tailed problem and to mitigate the negative impact of noisy labels.
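To make the loss-level adaptation concrete, the equations below give one common way a polynomial expansion of the cross-entropy loss can be combined with the focal loss (in the style of PolyLoss); this is an illustrative sketch, and the exact coefficients and truncation used by the framework may differ. The symbols $p_t$, $\gamma$, and $\epsilon_1$ are introduced here only for illustration:

\[
-\log p_t \;=\; \sum_{j=1}^{\infty} \tfrac{1}{j}\,(1-p_t)^{j},
\qquad
-(1-p_t)^{\gamma}\log p_t \;=\; \sum_{j=1}^{\infty} \tfrac{1}{j}\,(1-p_t)^{j+\gamma},
\]
\[
L_{\text{Poly-1}}(p_t) \;=\; -(1-p_t)^{\gamma}\log p_t \;+\; \epsilon_1\,(1-p_t)^{\gamma+1},
\]

where $p_t$ is the predicted probability of the target class, $\gamma$ is the focusing parameter of the focal loss, and $\epsilon_1$ perturbs the coefficient of the leading polynomial term, thereby re-weighting individual terms of the loss.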
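As a rough illustration of the kind of metadata-driven re-weighting module described above, the sketch below uses a small feed-forward network that maps per-sample losses to weights, in the spirit of Meta-Weight-Net. The class name `WeightNet`, the hidden size, and the training setup are assumptions made for illustration, not the framework's actual implementation.

```python
import torch
import torch.nn as nn

class WeightNet(nn.Module):
    """Illustrative feed-forward re-weighting module (hypothetical).

    Maps a per-sample loss value to a weight in (0, 1). In a
    meta-learning setup, its parameters would be updated on a small
    clean metadata set via a bi-level optimization step.
    """
    def __init__(self, hidden: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),
        )

    def forward(self, per_sample_loss: torch.Tensor) -> torch.Tensor:
        # per_sample_loss: shape (batch,) -> weights: shape (batch,)
        return self.net(per_sample_loss.unsqueeze(-1)).squeeze(-1)

# Illustrative usage: weight per-sample losses before reduction.
criterion = nn.CrossEntropyLoss(reduction="none")
weight_net = WeightNet()
logits = torch.randn(8, 5)             # dummy model outputs
labels = torch.randint(0, 5, (8,))     # dummy (possibly noisy) labels
losses = criterion(logits, labels)     # per-sample losses, shape (8,)
weights = weight_net(losses.detach())  # learned per-sample weights
weighted_loss = (weights * losses).mean()
```

Down-weighting samples with unusually large losses is one way such a module can mitigate noisy labels, while up-weighting tail-class samples addresses the long-tailed distribution.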