External knowledge is often useful for natural language understanding tasks. We introduce a contextual text representation model called Conceptual-Contextual (CC) embeddings, which incorporates structured knowledge into text representations. Unlike entity embedding methods, our approach encodes a knowledge graph into a context model. CC embeddings can be easily reused for a wide range of tasks, just like pre-trained language models. By leveraging semantic generalizability, our model effectively encodes the large UMLS database. Experiments on electronic health records (EHRs) and medical text processing benchmarks show that our model substantially improves the performance of supervised medical NLP tasks.