Knowledge base question answering (KBQA) aims to answer a question over a knowledge base (KB). Early studies mainly focused on answering simple questions over KBs and achieved great success. However, their performance on complex questions is still far from satisfactory. Therefore, in recent years, researchers have proposed a large number of novel methods that address the challenges of answering complex questions. In this survey, we review recent advances in KBQA with a focus on solving complex questions, which usually contain multiple subjects, express compound relations, or involve numerical operations. In detail, we begin by introducing the complex KBQA task and relevant background. Then, we describe benchmark datasets for the complex KBQA task and introduce the construction process of these datasets. Next, we present two mainstream categories of methods for complex KBQA, namely semantic parsing-based (SP-based) methods and information retrieval-based (IR-based) methods. Specifically, we illustrate their procedures with flow designs and discuss their major differences and similarities. After that, we summarize the challenges that these two categories of methods encounter when answering complex questions, and explicate advanced solutions and techniques used in existing work. Finally, we conclude and discuss several promising directions related to complex KBQA for future research.