Knowledge-aware question answering (KAQA) requires a model to answer questions over a knowledge base, which is essential for both open-domain QA and domain-specific QA, especially when language models alone cannot provide all the knowledge needed. Despite the promising results of recent KAQA systems, which integrate linguistic knowledge from pre-trained language models (PLMs) and factual knowledge from knowledge graphs (KGs) to answer complex questions, a bottleneck remains in effectively fusing the representations from PLMs and KGs because of (i) the semantic and distributional gaps between them, and (ii) the difficulty of jointly reasoning over the knowledge provided by both modalities. To address these two problems, we propose a Fine-grained Two-stage training framework (FiTs) to boost KAQA system performance: the first stage, named knowledge adaptive post-training, aligns representations from the PLM and the KG, thus bridging the modality gap between them. The second stage, called knowledge-aware fine-tuning, aims to improve the model's joint reasoning ability based on the aligned representations. In detail, we fine-tune the post-trained model via two auxiliary self-supervised tasks in addition to the QA supervision. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on three benchmarks in the commonsense reasoning (i.e., CommonsenseQA, OpenbookQA) and medical question answering (i.e., MedQA-USMLE) domains.
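To make the two-stage recipe concrete, the following is a minimal, self-contained sketch of the training flow described above, not the authors' implementation: stage 1 aligns PLM and KG representations, and stage 2 combines the QA objective with auxiliary self-supervised losses. All module names, feature dimensions, and the choice of an InfoNCE-style contrastive loss for the alignment stage are illustrative assumptions; the paper's concrete objectives and architecture differ.

```python
# Illustrative sketch only (assumed components, not the FiTs code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyFusionModel(nn.Module):
    """Stand-in for a PLM encoder plus a KG encoder with a simple fusion scorer."""
    def __init__(self, plm_dim=768, kg_dim=200, hidden=256):
        super().__init__()
        self.plm_proj = nn.Linear(plm_dim, hidden)  # projects pooled PLM features
        self.kg_proj = nn.Linear(kg_dim, hidden)    # projects pooled KG subgraph features
        self.scorer = nn.Linear(2 * hidden, 1)      # scores each answer choice

    def encode(self, plm_feat, kg_feat):
        return self.plm_proj(plm_feat), self.kg_proj(kg_feat)

    def forward(self, plm_feat, kg_feat):
        # plm_feat: (batch, choices, plm_dim); kg_feat: (batch, choices, kg_dim)
        t, g = self.encode(plm_feat, kg_feat)
        return self.scorer(torch.cat([t, g], dim=-1)).squeeze(-1)  # (batch, choices)

def alignment_loss(text_vec, graph_vec, temperature=0.07):
    """Stage-1 stand-in: contrastive loss pulling matched PLM/KG pairs together."""
    t = F.normalize(text_vec, dim=-1)
    g = F.normalize(graph_vec, dim=-1)
    logits = t @ g.t() / temperature
    labels = torch.arange(t.size(0))
    return F.cross_entropy(logits, labels)

model = ToyFusionModel()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Stage 1: knowledge adaptive post-training on (text, subgraph) pairs.
plm_feat, kg_feat = torch.randn(8, 768), torch.randn(8, 200)
t, g = model.encode(plm_feat, kg_feat)
post_training_loss = alignment_loss(t, g)

# Stage 2: knowledge-aware fine-tuning = QA loss + two auxiliary self-supervised
# losses (placeholders here; the real tasks are defined in the paper).
plm_c, kg_c = torch.randn(8, 5, 768), torch.randn(8, 5, 200)
answers = torch.randint(0, 5, (8,))
qa_loss = F.cross_entropy(model(plm_c, kg_c), answers)
aux1, aux2 = torch.tensor(0.0), torch.tensor(0.0)
fine_tuning_loss = qa_loss + aux1 + aux2

# In practice the two stages run sequentially; they are combined here only to
# keep the toy example end-to-end.
(post_training_loss + fine_tuning_loss).backward()
opt.step()
```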