培训前语言模型更好的少见的学习者 (Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners) - 专知论文

会员服务 ·

0

Prompt · 语言模型化 · Better · 小样本学习 · 学习器 ·

2022 年 2 月 11 日

Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

翻译：培训前语言模型更好的少见的学习者

Ningyu Zhang,Luoqiu Li,Xiang Chen,Shumin Deng,Zhen Bi,Chuanqi Tan,Fei Huang,Huajun Chen

from arxiv, Accepted by ICLR 2022

Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and prompt design, hindering their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach involves reformulating potential natural language processing tasks into the task of a pre-trained language model and differentially optimizing the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) Plugged to any pre-trained language models; (ii) Extended to widespread classification tasks. A comprehensive evaluation of standard NLP tasks demonstrates that the proposed approach achieves a better few-shot performance. Code is available in https://github.com/zjunlp/DART.

翻译：大规模预先培训的语言模式通过作为少见的学习者展示出非凡的能力,极大地促进了自然语言处理,但其效力主要取决于模型参数的扩大和迅速设计,这妨碍了在大多数现实世界应用中应用这些参数。本研究报告提出了一种新型的可插入、可推广和高效的方法,名为“差异、可扩展和普及” (DART),它可以将小语言模式转换成更精细的学习者,而无需任何迅速的工程。这一方法的主要原则是将潜在的自然语言处理任务重新定位为预先培训的语言模式的任务,并有区别地优化快速模板和与反向调整的目标标签。此外,拟议的方法可以:(一) 插到任何预先培训的语言模式中;(二) 扩展到广泛的分类任务。对标准国家语言方案任务的全面评价表明,拟议的方法能够取得更精细的绩效。可在https://github.com/zjunlp/DATRT中查阅。

0

相关内容

Prompt

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

专知会员服务

43+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

移动互联网的用户隐私保护研究

国家自然科学基金

2+阅读 · 2017年12月31日

大数据高效能存储与管理方法研究

国家自然科学基金

3+阅读 · 2014年12月31日

谷子光周期、温度敏感性及相关性状的全基因组关联分析

国家自然科学基金

0+阅读 · 2014年12月31日

经典-量子协同计算：形式化模型、计算复杂性与模型检测

国家自然科学基金

1+阅读 · 2014年12月31日

肿瘤相关重组抗原蛋白与肿瘤治疗性DNA疫苗静电耦合成为复合纳米颗粒的初步研究

国家自然科学基金

0+阅读 · 2012年12月31日

SPARC在强直性脊柱炎发病中的作用机制

国家自然科学基金

0+阅读 · 2011年12月31日

地方性氟中毒神经系统损伤及智力改变的分子发病机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

两种海南特有暗罗属植物抗肿瘤活性物质基础研究

国家自然科学基金

0+阅读 · 2011年12月31日

高效特异清除凋亡细胞的小胶质细胞亚群的鉴定及其免疫生物学功能研究

国家自然科学基金

0+阅读 · 2011年12月31日

编码密码学中若干组合对象研究

国家自然科学基金

0+阅读 · 2009年12月31日

Impossible Triangle: What's Next for Pre-trained Language Models?

Arxiv

0+阅读 · 2022年4月20日

K-LITE: Learning Transferable Visual Models with External Knowledge

Arxiv

2+阅读 · 2022年4月20日

Contrastive Demonstration Tuning for Pre-trained Language Models

Arxiv

0+阅读 · 2022年4月18日

Pathologies of Pre-trained Language Models in Few-shot Fine-tuning

Arxiv

1+阅读 · 2022年4月17日

SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words

Arxiv

0+阅读 · 2022年4月16日

Just Fine-tune Twice: Selective Differential Privacy for Large Language Models

Arxiv

1+阅读 · 2022年4月15日

On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages

Arxiv

0+阅读 · 2022年4月5日

Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey

Arxiv

31+阅读 · 2021年11月1日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

Few-shot Learning with Meta Metric Learners

Arxiv

13+阅读 · 2019年1月26日

VIP会员

文章信息

相关主题

语言模型化

小样本学习

相关VIP内容

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

专知会员服务

43+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【新书】面向企业的图学习扩展：生产级图学习与推理，485页pdf

AI智能体编程：技术、挑战与机遇综述

【国家标准】数据安全技术数据安全风险评估方法

【CMU博士论文】交互式学习的进展：替代性反馈机制与自适应因果推理

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Impossible Triangle: What's Next for Pre-trained Language Models?

Arxiv

0+阅读 · 2022年4月20日

K-LITE: Learning Transferable Visual Models with External Knowledge

Arxiv

2+阅读 · 2022年4月20日

Contrastive Demonstration Tuning for Pre-trained Language Models

Arxiv

0+阅读 · 2022年4月18日

Pathologies of Pre-trained Language Models in Few-shot Fine-tuning

Arxiv

1+阅读 · 2022年4月17日

SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words

Arxiv

0+阅读 · 2022年4月16日

Just Fine-tune Twice: Selective Differential Privacy for Large Language Models

Arxiv

1+阅读 · 2022年4月15日

On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages

Arxiv

0+阅读 · 2022年4月5日

Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey

Arxiv

31+阅读 · 2021年11月1日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

Few-shot Learning with Meta Metric Learners

Arxiv

13+阅读 · 2019年1月26日

相关基金

移动互联网的用户隐私保护研究

国家自然科学基金

2+阅读 · 2017年12月31日

大数据高效能存储与管理方法研究

国家自然科学基金

3+阅读 · 2014年12月31日

谷子光周期、温度敏感性及相关性状的全基因组关联分析

国家自然科学基金

0+阅读 · 2014年12月31日

经典-量子协同计算：形式化模型、计算复杂性与模型检测

国家自然科学基金

1+阅读 · 2014年12月31日

肿瘤相关重组抗原蛋白与肿瘤治疗性DNA疫苗静电耦合成为复合纳米颗粒的初步研究

国家自然科学基金

0+阅读 · 2012年12月31日

SPARC在强直性脊柱炎发病中的作用机制

国家自然科学基金

0+阅读 · 2011年12月31日

地方性氟中毒神经系统损伤及智力改变的分子发病机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

两种海南特有暗罗属植物抗肿瘤活性物质基础研究

国家自然科学基金

0+阅读 · 2011年12月31日

高效特异清除凋亡细胞的小胶质细胞亚群的鉴定及其免疫生物学功能研究

国家自然科学基金

0+阅读 · 2011年12月31日

编码密码学中若干组合对象研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员