Recent work on enhancing BERT-based language representation models with knowledge graphs (KGs) and knowledge bases (KBs) has yielded promising results on multiple NLP tasks. State-of-the-art approaches typically combine the original input sentences with KG triples and feed the combined representation into a BERT model. However, because the sequence length of a BERT model is limited, such a framework can accommodate only a small amount of knowledge beyond the original input sentences and is thus forced to discard some of it. This problem is especially severe for downstream tasks whose input is a long paragraph or even a document, such as QA or reading comprehension. We address this problem with Roof-Transformer, a model with two underlying BERTs and a fusion layer on top: one underlying BERT encodes the knowledge resources, the other encodes the original input sentences, and the fusion layer integrates the two resulting encodings. Experimental results on a QA task and the GLUE benchmark attest to the effectiveness of the proposed model.
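To make the described architecture concrete, below is a minimal PyTorch sketch of the two-encoder-plus-fusion idea, assuming Hugging Face `transformers` for the underlying BERTs. The class name `RoofTransformer`, the choice of a single `nn.TransformerEncoderLayer` as the fusion layer over concatenated token encodings, and the `[CLS]`-based classifier head are illustrative assumptions; the abstract does not specify the exact fusion mechanism or output head.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class RoofTransformer(nn.Module):
    """Two underlying BERTs with a fusion ("roof") layer on top.

    One BERT encodes the original input sentences, the other encodes the
    knowledge resources (e.g., linearized KG triples); the fusion layer
    integrates the two resulting encodings.
    """

    def __init__(self, model_name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.sentence_bert = BertModel.from_pretrained(model_name)   # encodes the input sentences
        self.knowledge_bert = BertModel.from_pretrained(model_name)  # encodes the knowledge resources
        hidden = self.sentence_bert.config.hidden_size
        # Fusion layer: one Transformer encoder layer over the concatenated
        # token encodings (an assumed design choice for this sketch).
        self.fusion = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, sent_inputs, kg_inputs):
        # Each argument is a dict of tokenizer outputs (input_ids, attention_mask, ...).
        sent_enc = self.sentence_bert(**sent_inputs).last_hidden_state   # (B, L_sent, H)
        kg_enc = self.knowledge_bert(**kg_inputs).last_hidden_state      # (B, L_kg, H)
        fused = self.fusion(torch.cat([sent_enc, kg_enc], dim=1))        # (B, L_sent + L_kg, H)
        return self.classifier(fused[:, 0])  # predict from the [CLS] position of the sentence encoder
```

Because each BERT encodes its own sequence up to the full length limit, the knowledge input no longer competes with the original sentences for sequence positions, which is the motivation stated above.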