Textual entailment is a fundamental task in natural language processing. Most approaches to the problem use only the textual content present in the training data. A few approaches have shown that information from external knowledge sources such as knowledge graphs (KGs) can add value beyond the textual content by providing background knowledge that may be critical for a task. However, the proposed models do not fully exploit the information in these usually large and noisy KGs, and it is not clear how that information can be effectively encoded to be useful for entailment. We present an approach that complements text-based entailment models with information from KGs by (1) using Personalized PageRank to generate contextual subgraphs with reduced noise and (2) encoding these subgraphs using graph convolutional networks to capture KG structure. Our technique extends the capability of text models by exploiting structural and semantic information found in KGs. We evaluate our approach on multiple textual entailment datasets and show that the use of external knowledge helps improve prediction accuracy. This is particularly evident on the challenging BreakingNLI dataset, where we see an absolute improvement of 5-20% over multiple text-based entailment models.
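The subgraph-extraction step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the KG is available as a `networkx` graph, uses the library's built-in `pagerank` with a personalization vector concentrated on the seed concepts mentioned in the premise/hypothesis pair, and keeps only the top-scoring nodes as the reduced-noise contextual subgraph. The toy KG, the seed set, and the `top_k` cutoff are all hypothetical choices for the sketch.

```python
import networkx as nx

def contextual_subgraph(graph, seed_nodes, top_k=5):
    """Run Personalized PageRank from the seed concepts and keep the
    highest-scoring nodes as a reduced-noise contextual subgraph."""
    # Teleportation mass is restricted to the seed concepts, so scores
    # concentrate on the neighborhood relevant to the sentence pair.
    personalization = {n: (1.0 if n in seed_nodes else 0.0) for n in graph}
    scores = nx.pagerank(graph, alpha=0.85, personalization=personalization)
    keep = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return graph.subgraph(keep).copy()

# Toy KG: concepts that might surface in a premise/hypothesis pair.
kg = nx.Graph([
    ("dog", "animal"), ("animal", "mammal"), ("dog", "bark"),
    ("cat", "animal"), ("piano", "instrument"), ("instrument", "music"),
])

# Seeds drawn from the sentence pair; unrelated concepts (piano, music)
# receive no teleportation mass and fall outside the subgraph.
sub = contextual_subgraph(kg, seed_nodes={"dog", "cat"}, top_k=4)
```

The resulting subgraph would then be fed to the graph convolutional encoder; in a full pipeline the seed nodes would come from entity linking the premise and hypothesis against the KG.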