Pre-trained language models (PLMs) have shown their effectiveness in multiple scenarios. However, KBQA remains challenging, especially regarding coverage and generalization settings. This is due to two main factors: i) understanding the semantics of both questions and relevant knowledge from the KB; ii) generating executable logical forms with both semantic and syntactic correctness. In this paper, we present a new KBQA model, TIARA, which addresses those issues by applying multi-grained retrieval to help the PLM focus on the most relevant KB contexts, viz., entities, exemplary logical forms, and schema items. Moreover, constrained decoding is used to control the output space and reduce generation errors. Experiments over important benchmarks demonstrate the effectiveness of our approach. TIARA outperforms previous SOTA, including those using PLMs or oracle entity annotations, by at least 4.1 and 1.1 F1 points on GrailQA and WebQuestionsSP, respectively.
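To make the constrained-decoding idea concrete, the following is a minimal sketch (not the authors' implementation) of how a seq2seq PLM's output space can be restricted to logical-form tokens at generation time, assuming a T5-style model and the Hugging Face `transformers` `prefix_allowed_tokens_fn` hook; the operator list, schema items, and example question are hypothetical placeholders, and a full system would consult a grammar or trie over the decoded prefix rather than a flat allow-list.

```python
# Sketch of constrained decoding: restrict each decoding step to tokens that
# can form part of a valid logical form (operators plus retrieved schema items).
# This is an illustrative assumption, not TIARA's released code.

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical restricted vocabulary: S-expression operators plus schema items
# that a multi-grained retriever might return for the input question.
allowed_strings = ["(", ")", "AND", "JOIN", "ARGMAX", "COUNT",
                   "music.artist", "music.artist.album"]
allowed_ids = set()
for s in allowed_strings:
    allowed_ids.update(tokenizer(s, add_special_tokens=False).input_ids)
allowed_ids.add(tokenizer.eos_token_id)

def prefix_allowed_tokens_fn(batch_id, input_ids):
    # A real implementation would check the decoded prefix against a grammar;
    # here every step simply draws from the restricted vocabulary.
    return list(allowed_ids)

question = "what albums were recorded by the artist of thriller"
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=64,
    num_beams=4,
    prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because every beam can only emit tokens from the constrained set, syntactically invalid or non-existent schema items are pruned during search rather than filtered after generation, which is the error-reduction effect the abstract refers to.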