In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models - Zhuanzhi (专知) Papers

December 20, 2022

In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models

Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown

Given the success with in-context learning of large pre-trained language models, we introduce in-context learning distillation to transfer in-context few-shot learning ability from large models to smaller models. We propose to combine in-context learning objectives with language modeling objectives to distill both the ability to read in-context examples and task knowledge to the smaller models. We perform in-context learning distillation under two different few-shot learning paradigms: Meta In-context Tuning (Meta-ICT) and Multitask In-context Tuning (Multitask-ICT). Multitask-ICT performs better on multitask few-shot learning but also requires more computation than Meta-ICT. Our method shows consistent improvements for both Meta-ICT and Multitask-ICT on two benchmarks: LAMA and CrossFit. Our extensive experiments and analysis reveal that in-context learning objectives and language modeling objectives are complementary under the Multitask-ICT paradigm. In-context learning objectives achieve the best performance when combined with language modeling objectives.
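The abstract does not give the loss formulas, so the following is only a minimal sketch of what "combining in-context learning objectives with language modeling objectives" for distillation might look like. It assumes a standard soft-label distillation term (KL divergence between teacher and student distributions) mixed with a hard-label cross-entropy term, applied once to in-context examples and once to language modeling examples. The function names, the `alpha` and `lm_weight` mixing weights, and the `temperature` parameter are all hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def softmax(logits, temperature=1.0):
    """Softmax computed from the stable log-softmax."""
    return [math.exp(lp) for lp in log_softmax(logits, temperature)]


def cross_entropy(student_logits, target_index):
    """Hard-label loss: negative log-probability of the gold token/label."""
    return -log_softmax(student_logits)[target_index]


def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    """Soft-label loss: KL(teacher || student) at the given temperature."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def distill_term(teacher_logits, student_logits, target_index,
                 alpha=0.5, temperature=1.0):
    """Mix of hard-label and soft-label losses (alpha is an assumed weight)."""
    hard = cross_entropy(student_logits, target_index)
    soft = kl_divergence(teacher_logits, student_logits, temperature)
    return alpha * hard + (1.0 - alpha) * soft


def combined_loss(icl_examples, lm_examples,
                  alpha=0.5, lm_weight=1.0, temperature=1.0):
    """Assumed combination: in-context objective + lm_weight * LM objective.

    Each example is a (teacher_logits, student_logits, target_index) tuple.
    For in-context examples the student logits would come from a prompt that
    contains few-shot demonstrations; for LM examples, from plain text.
    """
    def mean_term(examples):
        terms = [distill_term(t, s, y, alpha, temperature)
                 for t, s, y in examples]
        return sum(terms) / len(terms)

    return mean_term(icl_examples) + lm_weight * mean_term(lm_examples)


if __name__ == "__main__":
    # Toy logits over a 3-token vocabulary; values are made up.
    icl = [([2.0, 0.5, -1.0], [1.0, 0.8, -0.5], 0)]
    lm = [([0.1, 1.5, 0.3], [0.0, 1.0, 0.6], 1)]
    print(combined_loss(icl, lm))
```

In a real training loop the logits would come from the large teacher model and the smaller student model, and the loss would be computed with an autodiff framework rather than plain Python lists; the sketch only illustrates how the two objectives could be added together.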
