Parameter-Efficient Finetuning for Robust Continual Multilingual Learning - 专知 (Zhuanzhi) Papers


December 19, 2022


Kartikeya Badola, Shachi Dave, Partha Talukdar

We study the underexplored problem of Continual Multilingual Learning, where a multilingual model, already trained on task-specific data from all supported languages, is continually updated using batches of new multilingual training data for the same task. We show that naively updating the multilingual model can lead to losses in performance over a subset of languages although the aggregated performance metric shows an improvement. We establish this phenomenon over four tasks belonging to three task families (token-level, sentence-level and seq2seq). We then build upon recent advances in parameter-efficient finetuning to develop novel finetuning strategies that allow us to jointly minimize language-specific forgetting while encouraging positive cross-lingual transfer observed in this setup. Our proposed pipeline, LAFT-URIEL, improves the spread of gains over the supported languages while reducing the magnitude of language-specific losses incurred.
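The central observation in the abstract, that an aggregate metric can improve while individual languages regress, can be illustrated with a minimal sketch. The language codes, the scores, and the helper names `per_language_deltas` and `summarize` below are hypothetical and chosen only for illustration; they are not taken from the paper and do not represent LAFT-URIEL or the authors' evaluation protocol.

```python
from statistics import mean


def per_language_deltas(before, after):
    """Return the score change for each language after an update."""
    return {lang: after[lang] - before[lang] for lang in before}


def summarize(before, after):
    """Compare the aggregate change with the language-level changes."""
    deltas = per_language_deltas(before, after)
    return {
        # Change in the unweighted mean score across languages.
        "aggregate_delta": mean(list(after.values())) - mean(list(before.values())),
        # Languages whose score dropped after the update.
        "regressed": sorted(lang for lang, d in deltas.items() if d < 0),
        # Largest single-language loss (0.0 if no language regressed).
        "worst_loss": min(min(deltas.values()), 0.0),
    }


# Hypothetical per-language scores before and after a naive update.
before = {"en": 90.0, "hi": 80.0, "sw": 70.0, "ta": 75.0}
after = {"en": 93.0, "hi": 84.0, "sw": 68.0, "ta": 76.0}

report = summarize(before, after)
print(report)
```

In this made-up example the mean score rises by 1.5 points while one language (`sw`) loses 2 points, which is the kind of language-specific loss that an aggregate metric alone would hide.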

