In this paper, we are interested in developing semantic parsers that understand natural language questions embedded in a conversation with a user and ground them to formal queries over definitions in a general-purpose knowledge graph (KG) with a very large vocabulary (covering thousands of concept names and relations, and millions of entities). To this end, we develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to the execution results of those parses. We present two different semantic parsing approaches and highlight the challenges of the task: dealing with large vocabularies, modelling conversation context, predicting queries with multiple entities, and generalising to new questions at test time. We hope our dataset will serve as a useful testbed for the development of conversational semantic parsers. Our dataset and models are released at https://github.com/EdinburghNLP/SPICE.