Pre-trained language models (PLMs) have become representative foundation models in natural language processing. Most PLMs are trained with linguistically agnostic pre-training tasks defined on the surface form of the text, such as masked language modeling (MLM). To further empower PLMs with richer linguistic features, in this paper we propose a simple but effective way to learn linguistic features for pre-trained language models. Specifically, we propose LERT, a pre-trained language model trained on three types of linguistic features alongside the original MLM pre-training task, using a linguistically-informed pre-training (LIP) strategy. We carried out extensive experiments on ten Chinese NLU tasks, and the results show that LERT brings significant improvements over various comparable baselines. Furthermore, analytical experiments on several linguistic aspects confirm that the design of LERT is valid and effective. Resources are available at https://github.com/ymcui/LERT
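The abstract summarizes, but does not specify, how the three linguistic pre-training tasks are combined with MLM. The sketch below is a minimal, hypothetical illustration only, not the authors' implementation: it assumes the linguistic features are token-level tagging tasks, that each gets its own classification head on a shared encoder, and that the losses are mixed with per-task weights (the names, label-set sizes, and weighting scheme are placeholders, not taken from the paper).

```python
# Minimal multi-task pre-training sketch (assumed setup, not the official LERT code):
# a shared Transformer encoder feeds an MLM head plus three token-level
# linguistic tagging heads; the losses are combined with per-task weights.
import torch
import torch.nn as nn


class MultiTaskPretrainModel(nn.Module):
    def __init__(self, vocab_size=21128, hidden=768, num_layers=12,
                 linguistic_label_sizes=(32, 16, 48)):  # hypothetical label-set sizes
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=12,
                                           dim_feedforward=3072, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.mlm_head = nn.Linear(hidden, vocab_size)
        # One tagging head per linguistic feature type (placeholder tasks).
        self.linguistic_heads = nn.ModuleList(
            [nn.Linear(hidden, n) for n in linguistic_label_sizes])

    def forward(self, input_ids, mlm_labels, linguistic_labels, task_weights):
        # mlm_labels: [batch, seq], -100 marks unmasked (ignored) positions.
        # linguistic_labels: list of three [batch, seq] tensors, -100 = ignore.
        # task_weights: three floats mixing the auxiliary losses into the total.
        hidden = self.encoder(self.embed(input_ids))
        loss_fct = nn.CrossEntropyLoss(ignore_index=-100)
        loss = loss_fct(self.mlm_head(hidden).transpose(1, 2), mlm_labels)
        for head, labels, w in zip(self.linguistic_heads, linguistic_labels, task_weights):
            loss = loss + w * loss_fct(head(hidden).transpose(1, 2), labels)
        return loss
```

In this sketch, `task_weights` is the hook where a LIP-style schedule could plug in, e.g. changing each linguistic task's contribution over the course of pre-training; the concrete schedule used by LERT is described in the paper itself, not here.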