Recent advances in large pre-trained language models (PLMs) have led to impressive gains on natural language understanding (NLU) tasks with task-specific fine-tuning. However, directly fine-tuning PLMs relies heavily on a large amount of labeled data, which is usually hard to obtain. Prompt-based tuning of PLMs has proven valuable for various few-shot tasks. Existing work on prompt-based tuning for few-shot NLU tasks mainly focuses on deriving proper label words with a verbalizer or on generating prompt templates that elicit semantics from PLMs. In addition, conventional data augmentation methods have also proven useful for few-shot tasks. However, few data augmentation methods have been designed for the prompt-based tuning paradigm. We therefore study the new problem of data augmentation for prompt-based few-shot learners. Since label semantics are essential in prompt-based tuning, we propose PromptDA, a novel label-guided data augmentation method that exploits enriched label semantic information for augmentation. Extensive experimental results on few-shot text classification tasks show that our framework achieves superior performance by effectively leveraging label semantics and data augmentation for natural language understanding.
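To make the prompt-based tuning paradigm discussed above concrete, the following is a minimal sketch of verbalizer-based classification with a masked PLM, using the Hugging Face transformers library. It is not the paper's implementation; the template ("It was [MASK].") and the label words in the verbalizer are illustrative assumptions.

```python
# Minimal sketch of prompt-based classification with a verbalizer.
# Template and label words are assumed for illustration, not taken
# from the paper.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Verbalizer: maps each class to one label word (an assumed choice).
verbalizer = {"positive": "great", "negative": "terrible"}
label_word_ids = {
    label: tokenizer.convert_tokens_to_ids(tokenizer.tokenize(" " + word))[0]
    for label, word in verbalizer.items()
}

def classify(sentence: str) -> str:
    # Wrap the input in a cloze-style template; the PLM fills the mask.
    prompt = f"{sentence} It was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the mask position in the input sequence.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    # Score each class by its label word's logit at the mask position.
    scores = {label: logits[0, mask_pos, wid].item()
              for label, wid in label_word_ids.items()}
    return max(scores, key=scores.get)

print(classify("The movie was a delight from start to finish."))
```

Because predictions hinge on the logits of the label words, augmentation methods tailored to this paradigm, such as the label-guided approach the abstract proposes, can exploit those label semantics rather than perturbing inputs blindly.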