METICL: 学习从背景中学习 (MetaICL: Learning to Learn In Context) - 专知论文

会员服务 ·

0

学成 · 学习的学习 · Performer · MoDELS · domain shift ·

2022 年 5 月 3 日

MetaICL: Learning to Learn In Context

翻译：METICL: 学习从背景中学习

Sewon Min,Mike Lewis,Luke Zettlemoyer,Hannaneh Hajishirzi

from arxiv, 19 pages, 2 figures. Published as a conference paper at NAACL 2022 (long). Code available at https://github.com/facebookresearch/MetaICL

We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks. This meta-training enables the model to more effectively learn a new task in context at test time, by simply conditioning on a few training examples with no parameter updates or task-specific templates. We experiment on a large, diverse collection of tasks consisting of 142 NLP datasets including classification, question answering, natural language inference, paraphrase detection and more, across seven different meta-training/target splits. MetaICL outperforms a range of baselines including in-context learning without meta-training and multi-task learning followed by zero-shot transfer. We find that the gains are particularly significant for target tasks that have domain shifts from the meta-training tasks, and that using a diverse set of the meta-training tasks is key to improvements. We also show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task, and outperforms much bigger models with nearly 8x parameters. Finally, we show that MetaICL is complementary to human-written instructions, and the best performance can be achieved by combining both approaches.

翻译：我们引入了MetaICL(Meta-training for Intextlearning)(Meta-traination for Intextlearning),这是一个用于微小学习的新的元培训框架,通过对预先培训的语言模式进行调整,在大量培训任务中进行文字化学习。这种元培训使该模式能够在测试时更有效地学习新的任务,只要以几个培训范例为条件,而没有参数更新或任务特定模板即可。我们试验了由142个NLP数据集组成的庞大而多样的任务汇编,其中包括分类、问答、自然语言推论、语音探测和七个不同的元培训/目标分割。MetaICL超越了一系列基线,包括不进行元培训和多任务化学习,然后是零点转移。我们发现,这些收益对于目标任务来说特别重要,而没有参数更新,使用一套多样化的元培训任务是改进的关键。我们还表明,MetICL(有时击败)方法能够完全调整模型在目标任务上的表现,超越一系列基准,包括文中学习,没有进行元培训和多课学习,而没有零点转移。我们发现,将模型与最接近8个模型结合起来。

0

相关内容

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

专知会员服务

60+阅读 · 2022年5月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

ARVCF调节cadherin/catenin复合体介导的细胞间黏附的分子机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

ARHGAP9基因在肝癌侵袭转移中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

20克级的水溶性Mn-Cu-In-S磁/光双功能量子点的制备

国家自然科学基金

0+阅读 · 2015年12月31日

E3泛素连接酶CUL7修饰Caspase-8调节乳腺癌细胞生存的研究

国家自然科学基金

0+阅读 · 2013年12月31日

海洋天然产物Lamellarin D糖基化衍生物的合成与构效关系研究

国家自然科学基金

0+阅读 · 2013年12月31日

金属离子增强甲基汞去甲基化作用机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

微波场中纳米氧化物的溶液燃烧合成过程及机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR激活对吗啡耐受的调控及其分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

以EGFR为识别靶位多靶点联合克服NSCLC EGFR TKIs耐药的基因干预研究

国家自然科学基金

0+阅读 · 2011年12月31日

过热金属熔体行为与非晶形成能力

国家自然科学基金

0+阅读 · 2008年12月31日

Learngene: From Open-World to Your Learning Task

Arxiv

0+阅读 · 2022年6月17日

Bridge-Tower: Building Bridges Between Encoders in Vision-Language Representation Learning

Arxiv

0+阅读 · 2022年6月17日

Pre-Trained Models: Past, Present and Future

Arxiv

19+阅读 · 2021年6月15日

Learning from Few Samples: A Survey

Learning from Few Samples: A Survey

Arxiv

77+阅读 · 2020年7月30日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

Unsupervised Domain Clusters in Pretrained Language Models

Arxiv

11+阅读 · 2020年4月5日

Learning in the Frequency Domain

Learning in the Frequency Domain

Arxiv

11+阅读 · 2020年3月12日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

Low-Shot Learning from Imaginary Data

Arxiv

15+阅读 · 2018年4月3日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

VIP会员

文章信息

相关主题

学习的学习

相关VIP内容

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

专知会员服务

60+阅读 · 2022年5月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

相关论文

Learngene: From Open-World to Your Learning Task

Arxiv

0+阅读 · 2022年6月17日

Bridge-Tower: Building Bridges Between Encoders in Vision-Language Representation Learning

Arxiv

0+阅读 · 2022年6月17日

Pre-Trained Models: Past, Present and Future

Arxiv

19+阅读 · 2021年6月15日

Learning from Few Samples: A Survey

Learning from Few Samples: A Survey

Arxiv

77+阅读 · 2020年7月30日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

Unsupervised Domain Clusters in Pretrained Language Models

Arxiv

11+阅读 · 2020年4月5日

Learning in the Frequency Domain

Learning in the Frequency Domain

Arxiv

11+阅读 · 2020年3月12日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

Low-Shot Learning from Imaginary Data

Arxiv

15+阅读 · 2018年4月3日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

相关基金

ARVCF调节cadherin/catenin复合体介导的细胞间黏附的分子机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

ARHGAP9基因在肝癌侵袭转移中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

20克级的水溶性Mn-Cu-In-S磁/光双功能量子点的制备

国家自然科学基金

0+阅读 · 2015年12月31日

E3泛素连接酶CUL7修饰Caspase-8调节乳腺癌细胞生存的研究

国家自然科学基金

0+阅读 · 2013年12月31日

海洋天然产物Lamellarin D糖基化衍生物的合成与构效关系研究

国家自然科学基金

0+阅读 · 2013年12月31日

金属离子增强甲基汞去甲基化作用机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

微波场中纳米氧化物的溶液燃烧合成过程及机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR激活对吗啡耐受的调控及其分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

以EGFR为识别靶位多靶点联合克服NSCLC EGFR TKIs耐药的基因干预研究

国家自然科学基金

0+阅读 · 2011年12月31日

过热金属熔体行为与非晶形成能力

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员