A natural language database interface (NLDB) can democratize data-driven insights for non-technical users. However, existing Text-to-SQL semantic parsers are not accurate enough in the cross-database setting to be usable in practice. This work presents Turing, an NLDB system that works toward bridging this gap. Turing's cross-domain semantic parser, combined with our novel value prediction method, achieves $75.1\%$ execution accuracy and $78.3\%$ top-5 beam execution accuracy on the Spider validation set. To benefit from the higher beam accuracy, we design an interactive system in which the SQL hypotheses in the beam are explained step by step in natural language, with their differences highlighted. The user can then compare the hypotheses and select the one, if any, that reflects their intention. The English explanations of SQL queries in Turing are produced by our high-precision natural language generation system based on synchronous grammars.
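To make the two reported metrics concrete, the following is a minimal sketch of how execution accuracy and top-k beam execution accuracy could be computed. It is an illustration under stated assumptions, not the Spider evaluation script or Turing's code: the function names, the toy `singer` table, and the choice to compare result sets as order-insensitive multisets are all hypothetical.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Execute sql and return its rows as a multiset, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Assumption: row order is ignored when comparing results.
    return Counter(rows)


def beam_hit(conn, gold_sql, beam, k):
    """True if any of the top-k hypotheses returns the gold query's result."""
    gold = run_query(conn, gold_sql)
    if gold is None:
        return False
    return any(run_query(conn, hyp) == gold for hyp in beam[:k])


def execution_accuracy(conn, examples, k=1):
    """Fraction of (gold_sql, beam) examples with a hit in the top k.

    k=1 corresponds to plain execution accuracy; k=5 to top-5 beam accuracy.
    """
    hits = sum(beam_hit(conn, gold, beam, k) for gold, beam in examples)
    return hits / len(examples)


# Toy database and beams (hypothetical data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ann", 25), ("Bob", 40), ("Cy", 35)],
)

examples = [
    # The first hypothesis already matches the gold result.
    (
        "SELECT name FROM singer WHERE age > 30",
        ["SELECT name FROM singer WHERE age > 30"],
    ),
    # Only the second hypothesis in the beam matches the gold result.
    (
        "SELECT count(*) FROM singer",
        [
            "SELECT count(*) FROM singer WHERE age > 30",
            "SELECT count(*) FROM singer",
        ],
    ),
]

top1 = execution_accuracy(conn, examples, k=1)
top5 = execution_accuracy(conn, examples, k=5)
```

In this toy setup the second example is missed at k=1 but recovered at k=5, which is the kind of gap between single-best and beam accuracy that motivates letting the user choose among explained hypotheses.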