Relational structures such as schema linking and schema encoding have been validated as a key component for accurately translating natural language into SQL queries. However, introducing these structural relations comes at a price: they often result in a specialized model structure, which largely prohibits the use of large pretrained models in text-to-SQL. To address this problem, we propose RASAT: a Transformer seq2seq architecture augmented with relation-aware self-attention that can leverage a variety of relational structures while at the same time effectively inheriting the pretrained parameters from the T5 model. Our model is able to incorporate almost all types of existing relations in the literature, and in addition, we propose to introduce co-reference relations for the multi-turn scenario. Experimental results on three widely used text-to-SQL datasets, covering both single-turn and multi-turn scenarios, show that RASAT achieves competitive results on all three benchmarks, reaching state-of-the-art performance in execution accuracy (80.5\% EX on Spider, 53.1\% IEX on SParC, and 37.5\% IEX on CoSQL).
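To make the central mechanism concrete, below is a minimal, hedged sketch of relation-aware self-attention in the style of Shaw et al. (2018) and RAT-SQL, where learned embeddings of pairwise relations (e.g., schema-linking or co-reference relations) bias both the attention logits and the aggregated values. The class name, single-head formulation, and relation-id encoding are illustrative assumptions, not the paper's exact implementation inside T5.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationAwareSelfAttention(nn.Module):
    """Single-head relation-aware self-attention (illustrative sketch).

    Pairwise relation embeddings r_ij bias both the attention logits
    and the value aggregation, following Shaw et al. (2018) / RAT-SQL.
    """

    def __init__(self, d_model: int, num_relations: int):
        super().__init__()
        self.d_model = d_model
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # One learned embedding per relation type, for keys and for values.
        self.rel_k = nn.Embedding(num_relations, d_model)
        self.rel_v = nn.Embedding(num_relations, d_model)

    def forward(self, x: torch.Tensor, relations: torch.Tensor) -> torch.Tensor:
        # x:         (batch, seq_len, d_model) token representations
        # relations: (batch, seq_len, seq_len) integer relation ids r_ij
        q, k, v = self.q(x), self.k(x), self.v(x)
        rk, rv = self.rel_k(relations), self.rel_v(relations)  # (B, L, L, D)

        # Logits: e_ij = q_i · (k_j + r_ij^K) / sqrt(d)
        scores = torch.einsum("bid,bjd->bij", q, k)
        scores = scores + torch.einsum("bid,bijd->bij", q, rk)
        attn = F.softmax(scores / self.d_model ** 0.5, dim=-1)

        # Output: z_i = sum_j attn_ij * (v_j + r_ij^V)
        out = torch.einsum("bij,bjd->bid", attn, v)
        out = out + torch.einsum("bij,bijd->bid", attn, rv)
        return out
```

Under this formulation, setting all relation ids to a single "no relation" type recovers ordinary self-attention, which is what allows pretrained attention weights (e.g., from T5) to be reused while the relation embeddings are learned from scratch.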