ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts - Zhuanzhi Papers

Prompt · tuning · Attention · SOFT · knowledge

December 1, 2022

ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts

Akari Asai,Mohammadreza Salehi,Matthew E. Peters,Hannaneh Hajishirzi

From arXiv. Published as a conference paper at EMNLP 2022 (long). Code available at https://github.com/AkariAsai/ATTEMPT

This work introduces a new multi-task, parameter-efficient language model (LM) tuning method that learns to transfer knowledge across different tasks via a mixture of soft prompts, i.e., small prefix embedding vectors pre-trained for different tasks. Our method, called ATTEMPT (ATTEntional Mixtures of Prompt Tuning), obtains source prompts as encodings of large-scale source tasks into a small number of parameters and trains an attention module to interpolate the source prompts and a newly initialized target prompt for every instance in the target task. During training, only the target task prompt and the attention weights, which are shared between tasks in multi-task training, are updated, while the original LM and source prompts are kept intact. ATTEMPT is highly parameter-efficient (e.g., it updates 2,300 times fewer parameters than full fine-tuning) while achieving high task performance using knowledge from high-resource tasks. Moreover, it is modular: because it uses pre-trained soft prompts, source prompts can be flexibly added or removed for effective knowledge transfer. Our experimental results across 21 diverse NLP datasets show that ATTEMPT significantly outperforms prompt tuning and outperforms or matches fully fine-tuned or other parameter-efficient tuning approaches that use over ten times more parameters. Finally, ATTEMPT outperforms previous work in few-shot learning settings.
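The per-instance interpolation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the max pooling over tokens, the single projection matrix `W`, the dot-product scoring, and the `temperature` parameter are all assumptions chosen for illustration, and every function name is hypothetical. In this sketch the source prompts would stay frozen, and only `target_prompt` and `W` would be trained.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def mix_prompts(input_emb, source_prompts, target_prompt, W, temperature=1.0):
    """Interpolate source prompts and a target prompt for one input instance.

    input_emb:      (T, d) token embeddings of the instance
    source_prompts: (K, L, d) frozen prompts from K source tasks
    target_prompt:  (L, d) trainable prompt for the target task
    W:              (d, d) trainable projection (assumed form of the attention module)
    """
    # Candidate set: the K source prompts plus the target prompt, shape (K+1, L, d).
    candidates = np.concatenate([source_prompts, target_prompt[None]], axis=0)
    # Assumption: summarize the input and each prompt by max pooling over tokens.
    query = input_emb.max(axis=0) @ W      # (d,)
    keys = candidates.max(axis=1)          # (K+1, d)
    # Assumption: scaled dot-product scores turned into weights with softmax.
    weights = softmax(keys @ query / temperature)       # (K+1,)
    # Weighted sum of the candidate prompts, shape (L, d).
    mixed = np.tensordot(weights, candidates, axes=1)
    return mixed, weights

def prepend_prompt(mixed_prompt, input_emb):
    """Prepend the instance-specific prompt to the input embeddings."""
    return np.concatenate([mixed_prompt, input_emb], axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, L, T, d = 3, 4, 6, 8   # illustrative sizes only
    source = rng.normal(size=(K, L, d))
    target = rng.normal(size=(L, d))
    x = rng.normal(size=(T, d))
    mixed, w = mix_prompts(x, source, target, np.eye(d))
    print(w.shape, mixed.shape, prepend_prompt(mixed, x).shape)
```

Because the weights are recomputed for every instance, two inputs from the same target task can draw on different source prompts. Adding or removing a source task only changes the first dimension of `source_prompts`, which is one way to read the modularity claim.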
