Complex Query Answering (CQA) is an important reasoning task on knowledge graphs. Current CQA learning models have been shown to generalize from atomic operators to more complex formulas, an ability that can be regarded as combinatorial generalizability. In this paper, we present EFO-1-QA, a new dataset that benchmarks the combinatorial generalizability of CQA models by including 301 different query types, a collection 20 times larger than existing datasets. In addition, our work provides, for the first time, a benchmark to evaluate and analyze the impact of different operators and normal forms, using (a) 7 choices of operator systems and (b) 9 forms of complex queries. Specifically, we provide a detailed study of the combinatorial generalizability of two commonly used operators, i.e., projection and intersection, and justify the impact of query forms given the canonical choice of operators. Our code and data provide an effective pipeline to benchmark CQA models.
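To make the two operators named above concrete, the following is a minimal sketch of projection and intersection evaluated symbolically over a toy knowledge graph, and of how they compose into a larger query. It is an illustration only, not the EFO-1-QA pipeline or any learned CQA model: the graph, the entity and relation names, and the functions `projection` and `intersection` are all hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Hypothetical toy knowledge graph; all names are made up for illustration.
KG: Set[Triple] = {
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("carol", "works_at", "globex"),
    ("acme", "located_in", "berlin"),
    ("globex", "located_in", "paris"),
    ("alice", "lives_in", "berlin"),
    ("carol", "lives_in", "berlin"),
}


def projection(heads: Set[str], relation: str) -> Set[str]:
    """Return every tail reachable from any entity in `heads` via `relation`."""
    return {t for (h, r, t) in KG if r == relation and h in heads}


def intersection(*answer_sets: Set[str]) -> Set[str]:
    """Return the entities that appear in all of the given answer sets."""
    result = set(answer_sets[0])
    for other in answer_sets[1:]:
        result &= other
    return result


def employer_city(person: str) -> Set[str]:
    """Two chained projections: the cities where the person's employer is located."""
    return projection(projection({person}, "works_at"), "located_in")


def lives_near_work(person: str) -> Set[str]:
    """Intersection of two branches: cities where the person lives AND works."""
    return intersection(projection({person}, "lives_in"), employer_city(person))


if __name__ == "__main__":
    print(employer_city("alice"))
    print(lives_near_work("alice"))
    print(lives_near_work("carol"))
```

In this sketch the atomic operators are exact set operations. A learned CQA model would replace them with parameterized approximations, and combinatorial generalizability asks whether those approximations remain accurate when composed into query types not seen during training.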