Modern machine learning requires system designers to specify aspects of the learning pipeline, such as losses, architectures, and optimizers. Meta-learning, or learning-to-learn, instead aims to learn those aspects, and promises to unlock greater capabilities with less manual effort. One particularly ambitious goal of meta-learning is to train general-purpose in-context learning algorithms from scratch, using only black-box models with minimal inductive bias. Such a model takes in training data, and produces test-set predictions across a wide range of problems, without any explicit definition of an inference model, training loss, or optimization algorithm. In this paper we show that Transformers and other black-box models can be meta-trained to act as general-purpose in-context learners. We characterize phase transitions between algorithms that generalize, algorithms that memorize, and algorithms that fail to meta-train at all, induced by changes in model size, number of tasks, and meta-optimization. We further show that the capabilities of meta-trained algorithms are bottlenecked by the accessible state size (memory) determining the next prediction, unlike standard models which are thought to be bottlenecked by parameter count. Finally, we propose practical interventions such as biasing the training distribution that improve the meta-training and meta-generalization of general-purpose learning algorithms.
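To make the in-context learning interface concrete, here is a minimal sketch of how a meta-training example might be constructed. The task family (noiseless linear regression), the sequence layout, and all function names are illustrative assumptions, not the paper's actual setup; the least-squares solve merely stands in for a meta-learned black-box predictor to show the interface end to end.

```python
import numpy as np

def sample_task(rng, dim=3, n_train=8):
    """Sample one task: a random linear map defines the ground truth.
    (Assumed task family for illustration only.)"""
    w = rng.normal(size=dim)
    x_train = rng.normal(size=(n_train, dim))
    y_train = x_train @ w
    x_query = rng.normal(size=dim)
    y_query = x_query @ w
    return x_train, y_train, x_query, y_query

def build_context(x_train, y_train, x_query):
    """Flatten training pairs plus a query into one input sequence.
    This sequence is all a black-box learner sees: no inference model,
    loss, or optimizer is specified for the inner learning problem."""
    pairs = np.concatenate([x_train, y_train[:, None]], axis=1)  # (n, dim+1)
    query = np.concatenate([x_query, [0.0]])  # y slot zeroed for the query
    return np.concatenate([pairs, query[None, :]], axis=0)

rng = np.random.default_rng(0)
xt, yt, xq, yq = sample_task(rng)
seq = build_context(xt, yt, xq)          # shape (n_train + 1, dim + 1)
# A closed-form least-squares solve plays the role of the predictor here;
# in the paper this role is filled by a meta-trained sequence model.
w_hat, *_ = np.linalg.lstsq(xt, yt, rcond=None)
pred = xq @ w_hat
```

A meta-trained Transformer would be optimized so that, across many such sampled tasks, its output at the query position matches `y_query` directly from the context sequence.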