Most recent research on Text-to-SQL semantic parsing relies on either the parser itself or simple heuristic-based approaches to understand the natural language query (NLQ). When synthesizing a SQL query, the parser has no explicit semantic information about the NLQ available, which leads to poor generalization performance. In addition, without fine-grained, lexical-level query understanding, linking between the query and the database can only rely on fuzzy string matching, which leads to suboptimal performance in real applications. In view of this, we present a general-purpose, modular neural semantic parsing framework based on token-level, fine-grained query understanding. Our framework consists of three modules: a named entity recognizer (NER), a neural entity linker (NEL), and a neural semantic parser (NSP). By jointly modeling the query and the database, the NER model analyzes user intents and identifies entities in the query. The NEL model links typed entities to the schema and cell values in the database. The parser model leverages the available semantic information and linking results to synthesize tree-structured SQL queries based on a dynamically generated grammar. Experiments on SQUALL, a newly released semantic parsing dataset, show that we achieve 56.8% execution accuracy on the WikiTableQuestions (WTQ) test set, outperforming the state-of-the-art model by 2.7%.
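The data flow through the three modules (NER, then NEL, then NSP) can be illustrated with a toy sketch. This is not the authors' implementation: each neural model is replaced by a simple rule-based stand-in, and every name here (`Entity`, `recognize_entities`, `link_entities`, `synthesize_sql`, the `COLUMN`/`VALUE` tag set, and the table name `t`) is a hypothetical choice made only for illustration. The sketch also assumes lowercase column names and a single `SELECT ... WHERE` template in place of a dynamically generated grammar.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# A table is modeled as a mapping from lowercase column name to cell values.
Table = Dict[str, List[str]]


@dataclass
class Entity:
    text: str                     # token as it appears in the (lowercased) query
    label: str                    # hypothetical tag set: "COLUMN" or "VALUE"
    column: Optional[str] = None  # schema column the entity is linked to
    value: Optional[str] = None   # canonical cell value, for VALUE entities


def recognize_entities(query: str, table: Table) -> List[Entity]:
    """Stand-in for the NER module: tag tokens using query and table together."""
    cells = {c.lower() for col_cells in table.values() for c in col_cells}
    entities = []
    for token in query.lower().replace("?", "").split():
        if token in table:
            entities.append(Entity(token, "COLUMN"))
        elif token in cells:
            entities.append(Entity(token, "VALUE"))
    return entities


def link_entities(entities: List[Entity], table: Table) -> List[Entity]:
    """Stand-in for the NEL module: attach each entity to a column or cell."""
    for ent in entities:
        if ent.label == "COLUMN":
            ent.column = ent.text
            continue
        for col, col_cells in table.items():
            for cell in col_cells:
                if cell.lower() == ent.text:
                    ent.column, ent.value = col, cell
    return entities


def synthesize_sql(entities: List[Entity], table_name: str = "t") -> str:
    """Stand-in for the NSP module: fill one fixed SELECT ... WHERE template."""
    select_col = next((e.column for e in entities if e.label == "COLUMN"), "*")
    conditions = [f"{e.column} = '{e.value}'" for e in entities if e.label == "VALUE"]
    sql = f"SELECT {select_col} FROM {table_name}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


def parse(query: str, table: Table) -> str:
    """Run the three stages in order: recognize, link, synthesize."""
    return synthesize_sql(link_entities(recognize_entities(query, table), table))


TABLE: Table = {"year": ["2001", "2004"], "city": ["Oslo", "Athens"]}
print(parse("What year did Athens host?", TABLE))
```

The point of the modular layout is that the final stage never inspects the raw query string: it only consumes typed, linked entities, which is the kind of explicit semantic information the abstract argues a parser otherwise lacks.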