In text-to-SQL tasks, seq-to-seq models often yield sub-optimal performance due to limitations in their architecture. In this paper, we present a simple yet effective approach that adapts a transformer-based seq-to-seq model to robust text-to-SQL generation. Instead of imposing constraints on the decoder or reformulating the task as slot filling, we propose to train the seq-to-seq model with Schema-aware Denoising (SeaD), which consists of two denoising objectives that train the model to either recover the input or predict the output from two novel erosion and shuffle noises. These denoising objectives act as auxiliary tasks for better modeling of structural data in seq-to-seq generation. In addition, we propose an improved, clause-sensitive execution-guided (EG) decoding strategy to overcome the limitations of EG decoding for generative models. Experiments show that the proposed method improves the performance of the seq-to-seq model in both schema linking and grammar correctness, and establishes a new state of the art on the WikiSQL benchmark. The results indicate that the capacity of the vanilla seq-to-seq architecture for text-to-SQL may have been underestimated.
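To give a concrete feel for how schema-aware noises of this kind might be constructed, the following minimal Python sketch applies a hypothetical erosion noise (randomly dropping or substituting schema columns) and a shuffle noise (permuting column order) to a serialized question-plus-schema input. The function names, probabilities, and serialization format here are illustrative assumptions, not the exact definitions used in SeaD.

```python
import random

def erosion_noise(columns, drop_prob=0.15, swap_prob=0.15):
    """Illustrative erosion noise: randomly drop or substitute schema columns.

    Sketch only; the concrete erosion noise in SeaD may differ in detail.
    """
    noised = []
    for col in columns:
        r = random.random()
        if r < drop_prob:
            continue                                   # erode: drop the column
        if r < drop_prob + swap_prob:
            noised.append(random.choice(columns))      # erode: substitute another column
        else:
            noised.append(col)
    return noised

def shuffle_noise(columns):
    """Illustrative shuffle noise: permute the order of schema columns."""
    noised = list(columns)
    random.shuffle(noised)
    return noised

def serialize(question, columns):
    """Serialize question and schema into a single seq-to-seq input string."""
    return question + " <sep> " + " <col> ".join(columns)

# Usage: build a denoising training pair (noised input -> clean serialization)
columns = ["player", "team", "points", "season"]
question = "Which player scored the most points in 2016?"
noisy_input = serialize(question, shuffle_noise(erosion_noise(columns)))
target = serialize(question, columns)
```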
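Execution-guided decoding, which the abstract builds on, can be illustrated by executing candidate SQL queries and discarding those that fail. The sketch below is a generic, assumed re-ranking variant over a list of (sql, score) candidates against a SQLite database; SeaD's clause-sensitive EG decoding operates at a finer, clause-level granularity during generation.

```python
import sqlite3

def execution_guided_rerank(candidates, db_path):
    """Illustrative execution-guided (EG) filtering: return the highest-scoring
    candidate SQL that executes without error and yields a non-empty result.

    `candidates` is a list of (sql, score) pairs; this is a sketch of the
    general EG idea, not SeaD's clause-sensitive strategy.
    """
    conn = sqlite3.connect(db_path)
    try:
        for sql, _score in sorted(candidates, key=lambda c: -c[1]):
            try:
                rows = conn.execute(sql).fetchall()
                if rows:                 # non-empty result: accept this candidate
                    return sql
            except sqlite3.Error:
                continue                 # discard candidates that fail to execute
    finally:
        conn.close()
    return candidates[0][0]              # fall back to the top-scoring candidate
```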