Multilingual pre-trained language models (PLMs) have demonstrated impressive performance on several downstream tasks for both high-resourced and low-resourced languages. However, there is still a large performance drop for languages unseen during pre-training, especially African languages. One of the most effective approaches to adapt to a new language is \textit{language adaptive fine-tuning} (LAFT) -- fine-tuning a multilingual PLM on monolingual texts of a language using the pre-training objective. However, adapting to each target language individually takes large disk space and limits the cross-lingual transfer abilities of the resulting models because they have been specialized for a single language. In this paper, we perform \textit{multilingual adaptive fine-tuning} (MAFT) on 17 most-resourced African languages and three other high-resource languages widely spoken on the African continent to encourage cross-lingual transfer learning. To further specialize the multilingual PLM, we remove vocabulary tokens from the embedding layer that correspond to non-African writing scripts before MAFT, thus reducing the model size by around 50%. Our evaluation on two multilingual PLMs (AfriBERTa and XLM-R) and three NLP tasks (NER, news topic classification, and sentiment classification) shows that our approach is competitive with applying LAFT on individual languages while requiring significantly less disk space. Additionally, we show that our adapted PLM also improves the zero-shot cross-lingual transfer abilities of parameter-efficient fine-tuning methods.
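For readers who want a concrete picture of the vocabulary-reduction step, the following is a minimal Python sketch using the Hugging Face \texttt{transformers} library. It is an illustration under simplifying assumptions, not the authors' implementation: it approximates token removal by keeping only the subword IDs observed in a target corpus (the paper instead removes tokens of non-African writing scripts), and the file name \texttt{african\_corpus.txt} is a placeholder.

\begin{verbatim}
# Illustrative sketch only (not the authors' released code): shrink XLM-R's
# embedding layer to the subword tokens observed in a target corpus, then
# continue pre-training with the MLM objective (adaptive fine-tuning).
# "african_corpus.txt" and the corpus-based selection rule are assumptions.
import torch
from transformers import XLMRobertaTokenizer, XLMRobertaForMaskedLM

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaForMaskedLM.from_pretrained("xlm-roberta-base")

# 1) Collect the subword IDs actually used by the corpus (plus special tokens).
keep = set(tokenizer.all_special_ids)
with open("african_corpus.txt", encoding="utf-8") as f:
    for line in f:
        keep.update(tokenizer(line.strip(), add_special_tokens=False)["input_ids"])
keep = sorted(keep)
id_map = {old: new for new, old in enumerate(keep)}  # old ID -> new row index

# 2) Keep only the selected rows of the input embeddings and the MLM head bias.
old_emb = model.get_input_embeddings().weight.data
hidden = old_emb.size(1)
new_emb = torch.nn.Embedding(len(keep), hidden)
new_emb.weight.data = old_emb[keep].clone()
model.set_input_embeddings(new_emb)

new_bias = model.lm_head.bias.data[keep].clone()
model.lm_head.decoder = torch.nn.Linear(hidden, len(keep))
model.lm_head.bias = torch.nn.Parameter(new_bias)
model.lm_head.decoder.bias = model.lm_head.bias
model.tie_weights()                 # re-tie decoder weight to reduced embeddings
model.config.vocab_size = len(keep)

# 3) Remap token IDs with id_map when tokenizing, then continue MLM training,
#    e.g. with transformers' Trainer and DataCollatorForLanguageModeling.
\end{verbatim}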