Controllable Data Augmentation for Context-Dependent Text-to-SQL

April 28, 2023

Dingzirui Wang, Longxu Dou, Wanxiang Che

from arxiv, fix overlap

The limited scale of annotated data constrains existing context-dependent text-to-SQL models because labeling is complex. Data augmentation is commonly used to address this problem. However, the data generated by current augmentation methods often lack diversity. In this paper, we introduce ConDA, which generates interactive questions and the corresponding SQL results. We design the SQL dialogue state to enhance data diversity through state transitions. We also present a filtering method that ensures data quality: a grounding model identifies and filters out low-quality questions that do not match the state information. Experimental results on the SParC and CoSQL datasets show that ConDA boosts the baseline model, achieving an average improvement of $3.3\%$ on complex questions. Moreover, our analysis of the augmented data reveals that the data generated by ConDA are of high quality in SQL template hardness and types, turns, and question consistency.
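
The abstract gives no implementation details, so the following is only a toy sketch of the two ideas it names: moving through a dialogue state to produce a sequence of SQL queries, and filtering out questions that do not match the state. Every name here (SQLState, add_condition, add_order, is_grounded) is a hypothetical stand-in and not ConDA's actual code; in particular, the keyword check merely takes the place of the grounding model.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class SQLState:
    """Hypothetical dialogue state: the parts of the current SQL query."""
    table: str
    columns: Tuple[str, ...]
    conditions: Tuple[str, ...] = ()
    order_by: Optional[str] = None

    def to_sql(self) -> str:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        if self.order_by:
            sql += f" ORDER BY {self.order_by}"
        return sql


# State transitions (assumed for illustration): each returns a new state.
def add_condition(state: SQLState, cond: str) -> SQLState:
    return replace(state, conditions=state.conditions + (cond,))


def add_order(state: SQLState, column: str) -> SQLState:
    return replace(state, order_by=column)


def is_grounded(question: str, items) -> bool:
    """Toy stand-in for a grounding model: every schema item
    introduced at this turn must be mentioned in the question."""
    q = question.lower()
    return all(item.lower() in q for item in items)


# A three-turn dialogue built by successive transitions.
state0 = SQLState(table="singer", columns=("name",))
state1 = add_condition(state0, "age > 30")
state2 = add_order(state1, "age")

# (candidate question, state, schema items the question should mention)
turns = [
    ("Show the name of every singer.", state0, ["name", "singer"]),
    ("Only keep those whose age is above 30.", state1, ["age"]),
    ("What about the concerts?", state2, ["age"]),  # mismatches the state
]

# Keep only the question/SQL pairs whose question matches the state.
kept = [(q, s.to_sql()) for q, s, items in turns if is_grounded(q, items)]
for question, sql in kept:
    print(question, "->", sql)
```

In this sketch the third question is dropped because it never mentions the column the state transition introduced; a learned grounding model would make that decision in a real pipeline.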
