As decision-making becomes increasingly data-centric, seamless access to databases is of utmost importance. There is extensive research on building efficient text-to-SQL (TEXT2SQL) models for retrieving data from databases. Natural language is one of the best interfaces for bridging the gap between data and results, because it lets users, especially non-technical ones, query a database efficiently. It should open doors and generate considerable interest among users, whether they are technically proficient or not very skilled in query languages. Although numerous deep learning-based algorithms have been proposed and studied, it is still very challenging to build a generic model that solves data query problems using natural language in a real-world scenario. One reason is that different studies use different datasets, each of which comes with its own limitations and assumptions. At the same time, we lack a thorough understanding of the proposed models and of their limitations with respect to the specific datasets they are trained on. In this paper, we present a holistic overview of 24 recent neural network models studied in the last couple of years, including their architectures, which involve convolutional neural networks, recurrent neural networks, pointer networks, reinforcement learning, generative models, etc. We also give an overview of the 11 datasets that are widely used to train models for TEXT2SQL technologies, and we discuss future application possibilities of TEXT2SQL technologies for seamless data queries.