Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions

4 November 2022

Shuhao Gu, Bojie Hu, Yang Feng

From arXiv; EMNLP 2022 Main Conference long paper

This paper considers continual learning of a large-scale pretrained neural machine translation model without accessing the previous training data or introducing model separation. We argue that the widely used regularization-based methods, which perform multi-objective learning with an auxiliary loss, suffer from the misestimate problem and cannot always achieve a good balance between the previous and new tasks. To solve this problem, we propose a two-stage training method based on the local features of the real loss. We first search for low forgetting risk regions, in which the model retains its performance on the previous task as the parameters are updated, thereby avoiding catastrophic forgetting. We then continually train the model within this region, using only the new training data, to fit the new task. Specifically, we propose two methods for searching the low forgetting risk regions, based respectively on the curvature of the loss and on the impact of the parameters on the model output. We conduct experiments on domain adaptation and on more challenging language adaptation tasks, and the results show that our method achieves significant improvements over several strong baselines.
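The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a quadratic previous-task loss with a known, strictly positive diagonal curvature, and it takes the "region" to be a per-parameter box around the pretrained weights. The function names `search_region` and `train_in_region`, the tolerance `eps`, and the toy losses are all hypothetical, chosen only for illustration.

```python
import numpy as np


def search_region(curvature, eps):
    """Stage 1 (assumed form): per-parameter radii r_i chosen so that the
    second-order estimate of the old-task loss increase,
    0.5 * sum_i h_i * r_i**2, stays at or below eps.
    Requires strictly positive curvature values."""
    curvature = np.asarray(curvature, dtype=float)
    n = curvature.size
    return np.sqrt(2.0 * eps / (n * curvature))


def train_in_region(theta0, radii, grad_new, lr=0.1, steps=200):
    """Stage 2 (assumed form): gradient descent on the new task only,
    with each step projected back onto the box [theta0 - r, theta0 + r]."""
    theta = theta0.copy()
    lower, upper = theta0 - radii, theta0 + radii
    for _ in range(steps):
        theta = np.clip(theta - lr * grad_new(theta), lower, upper)
    return theta


# Toy problem: the old task is minimised at theta0, the new task at `target`.
theta0 = np.zeros(3)
h = np.array([100.0, 1.0, 0.01])      # diagonal curvature of the old loss
target = np.array([1.0, 1.0, 1.0])


def old_loss(theta):
    return 0.5 * np.sum(h * (theta - theta0) ** 2)


def new_loss(theta):
    return 0.5 * np.sum((theta - target) ** 2)


def grad_new(theta):
    return theta - target


eps = 0.03                             # tolerated old-task loss increase
radii = search_region(h, eps)
theta = train_in_region(theta0, radii, grad_new)
```

In this sketch the high-curvature parameter is allowed to move very little, while the flat direction is free to reach the new-task optimum, and no previous-task data or auxiliary loss is used during the second stage. For a real translation model the curvature would have to be estimated, so the bound on forgetting would be approximate rather than exact.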
