Multi-Task Pre-Training of Modular Prompt for Few-Shot Learning - 专知 (Zhuanzhi) Papers


Prompt · tuning · Learning · Few-shot learning

October 14, 2022

Multi-Task Pre-Training of Modular Prompt for Few-Shot Learning


Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang

Prompt tuning is a parameter-efficient approach to adapting pre-trained language models to downstream tasks. Although prompt tuning has been shown to match the performance of full model tuning when training data is sufficient, it tends to struggle in few-shot learning settings. In this paper, we present Multi-task Pre-trained Modular Prompt (MP2) to boost prompt tuning for few-shot learning. MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks. On downstream tasks, the pre-trained prompts are selectively activated and combined, leading to strong compositional generalization to unseen tasks. To bridge the gap between pre-training and fine-tuning, we formulate upstream and downstream tasks into a unified machine reading comprehension task. Extensive experiments under two learning paradigms, i.e., gradient descent and black-box tuning, show that MP2 significantly outperforms prompt tuning, full model tuning, and prior prompt pre-training methods in few-shot settings. In addition, we demonstrate that MP2 can achieve surprisingly fast and strong adaptation to downstream tasks by merely learning 8 parameters to combine the pre-trained modular prompts.
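The idea of adapting to a downstream task by learning only a handful of combination parameters can be illustrated with a minimal sketch. The abstract does not say how the 8 parameters are applied, so everything here is an assumption made for illustration: that there are 8 modular prompts, that each gets one scalar weight, and that the combined prompt is their weighted sum. The names `combine_prompts` and `make_toy_prompts`, the toy shapes, and the random values are hypothetical and are not taken from the paper.

```python
import random


def combine_prompts(prompts, weights):
    """Return the weighted sum of K modular prompts.

    Each prompt is a (length x dim) list of lists of floats.
    `weights` holds one scalar per prompt (an assumption, see lead-in).
    """
    if len(prompts) != len(weights):
        raise ValueError("need exactly one weight per modular prompt")
    length, dim = len(prompts[0]), len(prompts[0][0])
    combined = [[0.0] * dim for _ in range(length)]
    for prompt, w in zip(prompts, weights):
        for i in range(length):
            for j in range(dim):
                combined[i][j] += w * prompt[i][j]
    return combined


def make_toy_prompts(k=8, length=4, dim=3, seed=0):
    """Create k random stand-ins for frozen, pre-trained modular prompts."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(length)]
        for _ in range(k)
    ]


# Frozen "pre-trained" prompts (random placeholders in this sketch).
prompts = make_toy_prompts()

# The only task-specific parameters: one weight per prompt, 8 in total.
# Here they are set by hand; in practice they would be learned.
weights = [0.0] * 8
weights[2] = 1.0  # selectively activate a single module

combined = combine_prompts(prompts, weights)
```

In a real setting the weights would be optimized on the few-shot data, for example by gradient descent or by a derivative-free optimizer in the black-box setting the abstract mentions, while the prompts and the language model stay frozen.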

