Pre-trained language representation models such as BERT capture a general language representation from large-scale corpora, but lack domain-specific knowledge. When reading a domain text, experts make inferences with relevant knowledge. To give machines this capability, we propose a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into sentences as domain knowledge. However, injecting too much knowledge may divert a sentence from its correct meaning, an issue we call knowledge noise (KN). To overcome KN, K-BERT introduces soft-position embedding and a visible matrix to limit the impact of the injected knowledge. Because K-BERT can load model parameters from pre-trained BERT, it can easily inject domain knowledge simply by being equipped with a KG, without pre-training by itself. Our investigation reveals promising results on twelve NLP tasks. In domain-specific tasks (including finance, law, and medicine) in particular, K-BERT significantly outperforms BERT, which demonstrates that K-BERT is an excellent choice for knowledge-driven problems that require expert knowledge.
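To make the visible-matrix idea concrete, the following is a minimal sketch (not the authors' released code) of how such a matrix could be constructed, assuming that tokens injected from a KG triple should attend only to their anchor entity and to each other, while the original sentence tokens do not attend to injected tokens from other branches. The helper name `build_visible_matrix`, the span convention, and the token indices in the example are hypothetical.

```python
import numpy as np


def build_visible_matrix(seq_len, branch_spans):
    """Build a K-BERT-style visible matrix.

    branch_spans: list of (anchor_idx, (start, end)) pairs, where tokens in
    [start, end) come from a KG triple attached to the token at anchor_idx.
    Returns a 0/1 matrix: 1 means "visible", 0 means "masked". In practice
    the zeros would be turned into a large negative additive attention mask.
    """
    visible = np.ones((seq_len, seq_len), dtype=np.int32)
    for anchor, (start, end) in branch_spans:
        for i in range(start, end):
            # An injected token is invisible to everything except the other
            # tokens of its own branch and its anchor entity token.
            visible[i, :] = 0
            visible[:, i] = 0
            visible[i, start:end] = 1
            visible[start:end, i] = 1
            visible[i, anchor] = 1
            visible[anchor, i] = 1
    return visible


if __name__ == "__main__":
    # Hypothetical flattened sequence for "Tim Cook is visiting Beijing now"
    # with the triple (Beijing, capital, China) injected after "Beijing":
    # 0:[CLS] 1:Tim 2:Cook 3:is 4:visiting 5:Beijing 6:capital 7:China 8:now
    M = build_visible_matrix(seq_len=9, branch_spans=[(5, (6, 8))])
    print(M)
```

In this sketch, "capital" and "China" see only each other and "Beijing", so the injected triple enriches the entity's representation without leaking into unrelated positions, which is the intended remedy for knowledge noise.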