Natural language to SQL (NL2SQL) aims to parse a natural language question over a given database into a SQL query, a task that appears widely in practical Internet applications. Jointly encoding the database schema and the question utterance is a difficult but important part of NL2SQL. One solution is to treat the input as a heterogeneous graph. However, this approach fails to learn good word representations for the question utterance, and learning better word representations is important for building a well-designed NL2SQL system. To address this challenge, we present a Relation aware Semi-autoregressive Semantic Parsing (\MODN) framework, which is better suited to NL2SQL. It first learns relation embeddings over the schema entities and question words using predefined schema relations, with ELECTRA and relation aware transformer layers as the backbone. It then decodes the SQL query using a semi-autoregressive parser and predefined SQL syntax. Empirical results and a case study show that our model is effective at learning better word representations for NL2SQL.
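To make the idea of a relation aware transformer layer concrete, the following is a minimal single-head sketch of one common formulation, in which a learned relation embedding is added to the key and value vectors for each pair of items. It is an illustration under stated assumptions, not the layer used in this work: the function name `relation_aware_attention`, the helper names, and the integer relation ids are all hypothetical, and the sketch uses plain Python lists rather than a tensor library.

```python
import math


def matvec(vec, mat):
    """Multiply a row vector (length d) by a matrix (d x m)."""
    return [sum(vec[k] * mat[k][c] for k in range(len(vec)))
            for c in range(len(mat[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_aware_attention(x, rel_ids, w_q, w_k, w_v, rel_k, rel_v):
    """Single-head self-attention with pairwise relation embeddings.

    x:        n input vectors (question words and schema items), each length d
    rel_ids:  n x n table; rel_ids[i][j] is a hypothetical relation type id
              describing how item j relates to item i
    w_q, w_k, w_v: d x d projection matrices
    rel_k, rel_v:  one length-d embedding per relation type
    """
    d = len(w_q[0])
    n = len(x)
    q = [matvec(row, w_q) for row in x]
    k = [matvec(row, w_k) for row in x]
    v = [matvec(row, w_v) for row in x]

    outputs, all_weights = [], []
    for i in range(n):
        scores = []
        for j in range(n):
            # The relation embedding shifts the key seen by item i for item j.
            key = [k[j][t] + rel_k[rel_ids[i][j]][t] for t in range(d)]
            dot = sum(q[i][t] * key[t] for t in range(d))
            scores.append(dot / math.sqrt(d))
        weights = softmax(scores)
        # The relation embedding is also added to each value before mixing.
        out = [sum(weights[j] * (v[j][t] + rel_v[rel_ids[i][j]][t])
                   for j in range(n))
               for t in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights
```

In this sketch, a relation id might stand for a predefined schema relation such as "question word matches column name" (an assumed example). Because the relation embedding enters the attention score, two items with a strong predefined relation can attend to each other even when their content vectors are dissimilar.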