RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL - 专知 (Zhuanzhi) Papers


April 10, 2023

RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL


Haoyang Li, Jing Zhang, Cuiping Li, Hong Chen

From arXiv. Accepted to the AAAI 2023 main conference (oral).

Pre-trained language models are among the best recent approaches to Text-to-SQL. Because of the structure of SQL queries, the seq2seq model is responsible for parsing both the schema items (i.e., tables and columns) and the skeleton (i.e., SQL keywords). Such coupled targets make it harder to parse the correct SQL query, especially when the query involves many schema items and logic operators. This paper proposes a ranking-enhanced encoding and skeleton-aware decoding framework to decouple schema linking from skeleton parsing. Specifically, for a seq2seq encoder-decoder model, the encoder is fed the most relevant schema items instead of the whole unordered schema, which could alleviate the schema linking effort during SQL parsing, and the decoder first generates the skeleton and then the actual SQL query, which could implicitly constrain the SQL parsing. We evaluate our proposed framework on Spider and its three robustness variants: Spider-DK, Spider-Syn, and Spider-Realistic. The experimental results show that our framework delivers promising performance and robustness. Our code is available at https://github.com/RUCKBReasoning/RESDSQL.
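The two ideas in the abstract can be illustrated with a small sketch: build the encoder input from only the top-ranked schema items, and build the decoder target as a skeleton followed by the full SQL query. This is not the authors' implementation. The lexical-overlap scorer below is a toy stand-in for whatever ranking model the framework uses, and the function names, the `_` placeholder, the ` | ` separators, and the `top_tables` / `top_columns` cut-offs are all assumptions made for illustration.

```python
import re

# Tokens kept verbatim in the skeleton (assumed list, not taken from the paper).
KEEP = {
    "select", "from", "where", "group", "by", "order", "having", "limit",
    "join", "on", "and", "or", "not", "in", "like", "between", "distinct",
    "asc", "desc", "union", "intersect", "except", "as",
    "=", ">", "<", ">=", "<=", "!=",
}


def score(question: str, item: str) -> float:
    """Toy relevance score: fraction of the item's word tokens found in the question."""
    q_tokens = set(re.findall(r"[a-z0-9]+", question.lower()))
    i_tokens = set(re.findall(r"[a-z0-9]+", item.lower()))
    return len(q_tokens & i_tokens) / len(i_tokens) if i_tokens else 0.0


def rank_schema(question, schema, top_tables=1, top_columns=2):
    """Keep only the highest-scoring tables and, within each, the highest-scoring columns."""
    def table_score(table):
        return max([score(question, table)] + [score(question, c) for c in schema[table]])

    tables = sorted(schema, key=table_score, reverse=True)[:top_tables]
    return {
        t: sorted(schema[t], key=lambda c: score(question, c), reverse=True)[:top_columns]
        for t in tables
    }


def build_encoder_input(question, ranked):
    """Serialize the question followed by the ranked schema items, most relevant first."""
    parts = [f"{table} : {' , '.join(cols)}" for table, cols in ranked.items()]
    return question + " | " + " | ".join(parts)


def extract_skeleton(sql: str) -> str:
    """Keep tokens listed in KEEP; collapse every run of other tokens into one '_'.

    Assumes the SQL tokens are separated by whitespace.
    """
    out = []
    for tok in sql.lower().split():
        if tok in KEEP:
            out.append(tok)
        elif not out or out[-1] != "_":
            out.append("_")
    return " ".join(out)


def build_decoder_target(sql: str) -> str:
    """Target sequence: the skeleton first, then the actual SQL query."""
    return extract_skeleton(sql) + " | " + sql


question = "What is the name and age of every singer?"
schema = {
    "concert": ["concert_id", "theme", "year"],
    "singer": ["singer_id", "name", "age", "country"],
}
sql = "SELECT name , age FROM singer"

ranked = rank_schema(question, schema)
encoder_input = build_encoder_input(question, ranked)
decoder_target = build_decoder_target(sql)
print(encoder_input)
print(decoder_target)
```

In this toy example the irrelevant `concert` table never reaches the encoder, and the decoder target commits to the shape `select _ from _` before filling in the table and column names, which is the sense in which the skeleton constrains the final query.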

