Most existing pre-trained language models (PLMs) are sub-optimal for sentiment analysis tasks, as they capture sentiment information at the word level while under-considering sentence-level information. In this paper, we propose SentiWSP, a novel Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks. The word-level pre-training task detects replaced sentiment words via a generator-discriminator framework, enhancing the PLM's knowledge of sentiment words. The sentence-level pre-training task further strengthens the discriminator via a contrastive learning framework, with similar sentences as negative samples, to better encode the sentiment of a sentence. Extensive experiments show that SentiWSP achieves new state-of-the-art performance on various sentence-level and aspect-level sentiment classification benchmarks. Our code and models are publicly available at https://github.com/XMUDM/SentiWSP.
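The sentence-level contrastive objective described above can be sketched as an InfoNCE-style loss that pulls a sentence representation toward its positive and pushes it away from similar sentences used as negatives. This is a minimal illustrative sketch, not the paper's exact formulation; the function name and temperature value are assumptions.

```python
import math

def contrastive_loss(sim_pos, sim_negs, temperature=0.1):
    """Illustrative InfoNCE-style contrastive loss.

    sim_pos:  similarity between the anchor sentence and its positive.
    sim_negs: similarities between the anchor and negative (similar) sentences.
    Returns the negative log-probability of selecting the positive.
    """
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    m = max(logits)  # subtract max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)
```

Raising the positive similarity (or lowering the negatives') decreases the loss, which is what drives the discriminator to separate sentiment-bearing sentences from hard negatives.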