Pretraining a language model (LM) on text has been shown to help various downstream NLP tasks. Recent works show that a knowledge graph (KG) can complement text data, offering structured background knowledge that provides a useful scaffold for reasoning. However, these works are not pretrained to learn a deep fusion of the two modalities at scale, limiting the potential to acquire fully joint representations of text and KG. Here we propose DRAGON (Deep Bidirectional Language-Knowledge Graph Pretraining), a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale. Specifically, our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities. We pretrain this model by unifying two self-supervised reasoning tasks, masked language modeling and KG link prediction. DRAGON outperforms existing LM and LM+KG models on diverse downstream tasks including question answering across general and biomedical domains, with +5% absolute gain on average. In particular, DRAGON achieves notable gains on complex reasoning about language and knowledge (+10% on questions involving long contexts or multi-step reasoning) and low-resource QA (+8% on OBQA and RiddleSense), and new state-of-the-art results on various BioNLP tasks. Our code and trained models are available at https://github.com/michiyasunaga/dragon.
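To make the unified objective concrete, the sketch below shows one way the two self-supervised losses, masked language modeling on the text side and KG link prediction on the graph side, can be combined into a single training loss. It is a minimal illustration, not the authors' implementation: it assumes fused token and node representations already produced by the cross-modal encoder, uses a DistMult-style scoring function for link prediction, and all class and variable names (`JointPretrainLoss`, `token_states`, `head_emb`, etc.) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointPretrainLoss(nn.Module):
    """Minimal sketch of a DRAGON-style joint pretraining objective:
    masked language modeling over fused text representations plus
    KG link prediction over fused node representations.
    Names and structure are illustrative assumptions."""

    def __init__(self, vocab_size, hidden_dim, num_relations):
        super().__init__()
        self.mlm_head = nn.Linear(hidden_dim, vocab_size)       # predicts masked tokens
        self.rel_emb = nn.Embedding(num_relations, hidden_dim)   # relation embeddings for link prediction

    def forward(self, token_states, mlm_labels, head_emb, rel_ids, tail_emb, link_labels):
        # ---- Masked language modeling loss ----
        mlm_logits = self.mlm_head(token_states)                 # (batch, seq_len, vocab)
        mlm_loss = F.cross_entropy(
            mlm_logits.view(-1, mlm_logits.size(-1)),
            mlm_labels.view(-1),
            ignore_index=-100,                                   # unmasked positions are ignored
        )

        # ---- KG link prediction loss ----
        # DistMult-style score for a (head, relation, tail) triple:
        # sum(h * r * t); held-out edges (label 1) are scored against
        # corrupted negative edges (label 0) with binary cross-entropy.
        r = self.rel_emb(rel_ids)
        scores = (head_emb * r * tail_emb).sum(dim=-1)           # (num_edges,)
        link_loss = F.binary_cross_entropy_with_logits(scores, link_labels.float())

        # Unified objective: each modality supervises the other through
        # the shared, bidirectionally fused encoder.
        return mlm_loss + link_loss
```

Summing the two losses means gradients from both reasoning tasks flow through the same fused text-KG encoder, which is the sense in which the pretraining is "deeply joint" rather than training each modality's objective on a separate model.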