When translating natural language questions into SQL queries to answer questions from a database, contemporary semantic parsing models struggle to generalize to unseen database schemas. The generalization challenge lies in (a) encoding the database relations in an accessible way for the semantic parser, and (b) modeling alignment between database columns and their mentions in a given query. We present a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder. On the challenging Spider dataset this framework boosts the exact match accuracy to 57.2%, surpassing its best counterparts by 8.7% absolute improvement. Further augmented with BERT, it achieves the new state-of-the-art performance of 65.6% on the Spider leaderboard. In addition, we observe qualitative improvements in the model's understanding of schema linking and alignment. Our implementation will be open-sourced at https://github.com/Microsoft/rat-sql.
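The relation-aware self-attention the abstract refers to extends standard self-attention (in the style of Shaw et al., 2018) by injecting a learned embedding for each pairwise relation (e.g. a foreign-key link between two columns, or an exact-match link between a question token and a column name) into both the attention scores and the aggregated values. A minimal single-head NumPy sketch, with illustrative names and shapes of my own choosing rather than the authors' implementation:

```python
import numpy as np

def relation_aware_attention(x, rel_k, rel_v, wq, wk, wv):
    """One head of relation-aware self-attention (a sketch).

    x:             (n, d) item states (question tokens, columns, tables)
    rel_k, rel_v:  (n, n, d) embeddings of the relation between items i, j
                   (schema links, foreign keys, name matches, ...)
    wq, wk, wv:    (d, d) query/key/value projections
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    d = q.shape[-1]
    # Relation embeddings bias the compatibility score of every pair (i, j)...
    scores = (q[:, None, :] * (k[None, :, :] + rel_k)).sum(-1) / np.sqrt(d)
    alpha = np.exp(scores - scores.max(-1, keepdims=True))
    alpha /= alpha.sum(-1, keepdims=True)           # softmax over j
    # ...and also the values aggregated for each item i.
    return (alpha[:, :, None] * (v[None, :, :] + rel_v)).sum(axis=1)
```

With `rel_k` and `rel_v` set to zero this reduces to ordinary self-attention; the relation embeddings are what let a single encoder handle schema encoding and schema linking uniformly.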