Sentence semantic understanding is a key topic in natural language processing. Recently, contextualized word representations derived from pre-trained language models such as ELMo and BERT have yielded significant improvements on a wide range of semantic tasks, e.g., question answering, text classification, and sentiment analysis. However, how to incorporate external knowledge to further improve the semantic modeling capability of such models remains worth probing. In this paper, we propose a novel approach to combining syntactic information with a pre-trained language model. First, to evaluate the effect of pre-training, we introduce both RNN-based and Transformer-based pre-trained language models; second, to better integrate external knowledge such as syntactic information with the pre-trained model, we propose a dependency syntax expansion (DSE) model. For evaluation, we select two subtasks: a sentence completion task and a biological relation extraction task. The experimental results show that our model achieves 91.2\% accuracy on the sentence completion task, outperforming the baseline model by 37.8\%. It also achieves competitive performance with a 75.1\% $F_{1}$ score on the relation extraction task.