Text-to-SQL parsing is an essential and challenging task. Its goal is to convert a natural language (NL) question into the corresponding structured query language (SQL) query, based on the evidence provided by relational databases. Early text-to-SQL parsing systems from the database community achieved noticeable progress, but at the cost of heavy human engineering and user interaction with the systems. In recent years, deep neural networks have significantly advanced this task through neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query. Subsequently, large pre-trained language models have pushed the state of the art in text-to-SQL parsing to a new level. In this survey, we present a comprehensive review of deep learning approaches for text-to-SQL parsing. First, we introduce the text-to-SQL parsing corpora, which can be categorized as single-turn and multi-turn. Second, we provide a systematic overview of pre-trained language models and existing methods for text-to-SQL parsing. Third, we present the challenges faced by text-to-SQL parsing and explore some potential future directions in this field.
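To make the task definition concrete, the following minimal sketch shows a single-turn instance: an NL question, a database, and the SQL query a parser would be expected to produce. The `singer` table, its rows, the question, and the function name `run_example` are all hypothetical and chosen only for illustration; the SQL string is written by hand here and stands in for the output of a learned parser, which this sketch does not implement.

```python
import sqlite3


def run_example():
    """Build a toy database and execute a hand-written target SQL query."""
    # Hypothetical in-memory database; the schema is an assumption for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE singer ("
        "id INTEGER PRIMARY KEY, name TEXT, country TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO singer (name, country, age) VALUES (?, ?, ?)",
        [
            ("Alice", "France", 31),
            ("Bruno", "France", 45),
            ("Chen", "China", 28),
        ],
    )

    # Input to a text-to-SQL parser: the NL question (plus the schema above).
    question = "How many singers are from France?"

    # Expected output of the parser: a SQL query over that schema.
    # Written by hand here; a real system would generate it.
    target_sql = "SELECT COUNT(*) FROM singer WHERE country = 'France'"

    # Executing the query against the database yields the answer.
    answer = conn.execute(target_sql).fetchone()[0]
    conn.close()
    return question, target_sql, answer


if __name__ == "__main__":
    question, target_sql, answer = run_example()
    print(question)
    print(target_sql)
    print(answer)
```

In a multi-turn setting, the same database would be queried by a sequence of related questions, and each generated SQL query may depend on the earlier turns.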