Multiple pre-training objectives compensate for the limited understanding capability of single-objective language modeling, serving the ultimate purpose of pre-trained language models (PrLMs): generalizing well across a wide range of scenarios. However, learning multiple training objectives in a single model is challenging because their relative importance is unknown and they may conflict with one another. Empirical studies have shown that the current ad-hoc, manually set objective sampling makes the learned language representation barely converge to the desired optimum. Thus, we propose \textit{MOMETAS}, a novel adaptive sampler based on meta-learning, which learns the latent sampling pattern over arbitrary pre-training objectives. The design is lightweight and adds negligible training overhead. To validate our approach, we adopt five objectives and conduct continual pre-training with BERT-base and BERT-large models, where MOMETAS demonstrates a universal performance gain over other rule-based sampling strategies on 14 natural language processing tasks.