Multi-hop Question Answering over Knowledge Graph~(KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph~(KG). To cope with the vast search space, existing work usually adopts a two-stage approach: it first retrieves a relatively small subgraph related to the question and then performs reasoning over the subgraph to accurately find the answer entities. Although the two stages are highly related, previous work employs very different technical solutions for developing the retrieval and reasoning models, neglecting the relatedness of the two tasks in essence. In this paper, we propose UniKGQA, a novel approach for the multi-hop KGQA task that unifies retrieval and reasoning in both model architecture and parameter learning. For model architecture, UniKGQA consists of a semantic matching module based on a pre-trained language model~(PLM) for question-relation semantic matching, and a matching information propagation module that propagates the matching information along the edges of the KG. For parameter learning, we design a shared pre-training task based on question-relation matching for both the retrieval and reasoning models, and then propose retrieval- and reasoning-oriented fine-tuning strategies. Compared with previous studies, our approach is more unified, tightly coupling the retrieval and reasoning stages. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our method on the multi-hop KGQA task. Our code and data are publicly available at https://github.com/RUCAIBox/UniKGQA.
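The two modules described above can be illustrated with a deliberately simplified toy sketch: a per-relation matching score stands in for the PLM-based question-relation matcher, and one function performs a single hop of matching-information propagation along KG edges. All entity names, relations, and scores below are hypothetical, and the real UniKGQA model learns these scores rather than fixing them by hand.

```python
from collections import defaultdict

# Hypothetical tiny KG as (head, relation, tail) triples.
triples = [
    ("Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Obama", "spouse", "Michelle"),
]

# Stand-in for the PLM semantic matching module: fixed scores for the
# question "Where was Obama born?" (a real system scores relations with a PLM).
relation_match = {"born_in": 0.9, "located_in": 0.8, "spouse": 0.1}

def propagate(entity_scores, triples, relation_match):
    """One hop of matching-information propagation: each tail entity
    accumulates head score weighted by the relation's matching score."""
    new_scores = defaultdict(float)
    for head, rel, tail in triples:
        new_scores[tail] += entity_scores.get(head, 0.0) * relation_match.get(rel, 0.0)
    return dict(new_scores)

# Start from the topic entity and propagate two hops.
scores = {"Obama": 1.0}
hop1 = propagate(scores, triples, relation_match)
hop2 = propagate(hop1, triples, relation_match)
```

After two hops, the highest-scoring entity is the one reached through the best-matching relation chain, which is the intuition behind propagating question-relation matching information to rank candidate answers.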