Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Most Chinese pre-trained models adopt characters as the basic units for downstream tasks. However, these models ignore the information carried by words and thus lose some important semantics. In this paper, we propose a new method that exploits word structure and integrates lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding into the embeddings of its internal characters according to similarity weights. To strengthen word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism that emphasizes important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results given by different tokenizers. Experimental results show that our approach outperforms the basic pre-trained models BERT, BERT-wwm and ERNIE on several Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. We conduct further analysis to demonstrate the effectiveness of each component of our model.
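To make the projection step more concrete, here is a minimal sketch of distributing a word's embedding over its internal characters by similarity weight. The abstract does not give the exact formulation, so the choices below are assumptions made for illustration only: cosine similarity as the similarity measure, a softmax to normalize the weights, and an additive combination with the original character vectors. The function names (`cosine`, `softmax`, `project_word_to_chars`) are hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def project_word_to_chars(word_vec, char_vecs):
    """Add a similarity-weighted share of the word vector to each character vector.

    Assumption: weights are a softmax over word/character cosine similarities,
    and the word information is combined additively. The paper may differ.
    """
    weights = softmax([cosine(word_vec, c) for c in char_vecs])
    enriched = [
        [c_k + w * w_k for c_k, w_k in zip(c, word_vec)]
        for c, w in zip(char_vecs, weights)
    ]
    return weights, enriched


# Toy example: a two-character word with 2-dimensional embeddings.
word_vec = [1.0, 0.0]
char_vecs = [[1.0, 0.0], [0.0, 1.0]]
weights, enriched = project_word_to_chars(word_vec, char_vecs)
```

In this toy case the first character is more similar to the word vector, so it receives the larger weight and a larger share of the word embedding. A real model would apply the same idea to the hidden states of the pre-trained encoder, not to hand-written vectors.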

