Conversational text-to-SQL is designed to translate multi-turn natural language questions into their corresponding SQL queries. Most state-of-the-art conversational text-to-SQL methods are incompatible with generative pre-trained language models (PLMs), such as T5. In this paper, we present a two-stage unified MultI-task Generation frAmework (MIGA) that leverages PLMs' ability to tackle conversational text-to-SQL. In the pre-training stage, MIGA first decomposes the main task into several related sub-tasks and then unifies them into the same sequence-to-sequence (Seq2Seq) paradigm with task-specific natural language prompts, so that multi-task training boosts the main task. In the fine-tuning stage, we propose four SQL perturbations to alleviate the error-propagation problem. MIGA achieves state-of-the-art performance on two benchmarks (SparC and CoSQL). We also provide extensive analyses and discussions to shed light on new perspectives for conversational text-to-SQL.
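The unified Seq2Seq formulation described above can be sketched as follows. This is a minimal, hypothetical illustration of how several sub-tasks might share one T5-style input format via task-specific natural language prompts; the task names and prompt wording here are assumptions for illustration, not the paper's exact prefixes.

```python
# Hypothetical sketch of a unified multi-task Seq2Seq input format for
# conversational text-to-SQL. Task prefixes and field separators are
# assumed for illustration; the paper's actual prompts may differ.

def build_prompt(task: str, history: list, question: str, schema: str) -> str:
    """Prefix each example with a task description so a single Seq2Seq
    model (e.g. T5) can be trained jointly on the main task and its
    related sub-tasks."""
    # Concatenate the multi-turn dialogue context with the current question.
    context = " | ".join(list(history) + [question])
    return f"{task}: question: {context} schema: {schema}"

# Main task plus an illustrative sub-task (names are assumptions).
examples = [
    build_prompt("translate to SQL",
                 ["Show all singers."], "Which are from France?",
                 "singer(name, country)"),
    build_prompt("predict relevant columns",
                 ["Show all singers."], "Which are from France?",
                 "singer(name, country)"),
]
```

Because every sub-task is expressed in the same text-to-text format, the prompt alone tells the model which task to perform, and all tasks can be mixed in a single training batch.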