Text-to-SQL parsers are crucial in enabling non-experts to effortlessly query relational data. Training such parsers, by contrast, generally requires expertise in annotating natural language (NL) utterances with corresponding SQL queries. In this work, we propose a weak supervision approach for training text-to-SQL parsers. We take advantage of the recently proposed question meaning representation called QDMR, an intermediate between NL and formal query languages. Given questions, their QDMR structures (annotated by non-experts or automatically predicted), and the answers, we are able to automatically synthesize SQL queries that are used to train text-to-SQL models. We test our approach by experimenting on five benchmark datasets. Our results show that the weakly supervised models perform competitively with those trained on annotated NL-SQL data. Overall, we effectively train text-to-SQL parsers, while using zero SQL annotations.
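To make the data-synthesis step concrete, the following sketch illustrates the answer-based filtering idea in Python: candidate SQL queries (which in our pipeline are derived from a question's QDMR structure; here they are hand-written stand-ins) are executed against the database, and a candidate is kept as a training pair only if its execution result matches the gold answer. The toy table, question, and helper names are illustrative assumptions, not the actual implementation.

```python
import sqlite3

def execution_matches(conn, sql, gold_answer):
    """Execute a candidate SQL query and compare its result set
    (as a multiset of rows) against the gold answer."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return False  # discard candidates that fail to execute
    return sorted(rows) == sorted(gold_answer)

def synthesize_pair(question, candidate_sqls, gold_answer, conn):
    """Return (question, sql) for the first candidate whose execution
    result matches the gold answer, or None if no candidate does."""
    for sql in candidate_sqls:
        if execution_matches(conn, sql, gold_answer):
            return (question, sql)
    return None

if __name__ == "__main__":
    # Toy in-memory database standing in for a benchmark schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE city (name TEXT, country TEXT, population INTEGER)")
    conn.executemany(
        "INSERT INTO city VALUES (?, ?, ?)",
        [("Paris", "France", 2148000), ("Lyon", "France", 513000),
         ("Berlin", "Germany", 3645000)],
    )

    question = "How many cities in France have more than one million people?"
    # Candidates that, in the full pipeline, would be synthesized from the
    # question's QDMR decomposition; written by hand for this example.
    candidates = [
        "SELECT COUNT(*) FROM city WHERE country = 'France'",
        "SELECT COUNT(*) FROM city WHERE country = 'France' AND population > 1000000",
    ]
    gold_answer = [(1,)]  # the annotated answer to the question

    print(synthesize_pair(question, candidates, gold_answer, conn))
```

The resulting (question, SQL) pairs serve as weakly supervised training data for a text-to-SQL model, without any manually written SQL annotations.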