March 11, 2021

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models

Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor, Nizar Habash

From arXiv; accepted to WANLP 2021

In this paper, we explore the effects of language variants, data sizes, and fine-tuning task types in Arabic pre-trained language models. To do so, we build three pre-trained language models across three variants of Arabic: Modern Standard Arabic (MSA), dialectal Arabic, and classical Arabic, in addition to a fourth language model which is pre-trained on a mix of the three. We also examine the importance of pre-training data size by building additional models that are pre-trained on a scaled-down set of the MSA variant. We compare our different models to each other, as well as to eight publicly available models by fine-tuning them on five NLP tasks spanning 12 datasets. Our results suggest that the variant proximity of pre-training data to fine-tuning data is more important than the pre-training data size. We exploit this insight in defining an optimized system selection model for the studied tasks.
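
The selection idea in the last two sentences can be illustrated with a minimal sketch: pick the pre-trained model whose variant matches the variant of the fine-tuning data. This is only an illustration under stated assumptions, not the authors' selection model. The model identifiers, the `select_model` helper, and the fallback to the mixed-variant model are all hypothetical; the abstract names only the four pre-training variants.

```python
from typing import Dict

# Hypothetical identifiers, one per pre-training variant named in the abstract.
# These are placeholders, not the names of released models.
PRETRAINED_BY_VARIANT: Dict[str, str] = {
    "msa": "model-msa",        # Modern Standard Arabic
    "dialectal": "model-da",   # dialectal Arabic
    "classical": "model-ca",   # classical Arabic
    "mix": "model-mix",        # mix of the three variants
}


def select_model(task_variant: str) -> str:
    """Return the model whose pre-training variant matches the fine-tuning data.

    Assumption: when the variant is unknown, fall back to the mixed model.
    The abstract does not specify a fallback rule.
    """
    key = task_variant.strip().lower()
    return PRETRAINED_BY_VARIANT.get(key, PRETRAINED_BY_VARIANT["mix"])


if __name__ == "__main__":
    for variant in ("MSA", "dialectal", "unlabeled"):
        print(variant, "->", select_model(variant))
```

The lookup keys on variant only, which mirrors the reported finding that variant proximity matters more than pre-training data size; a fuller selector would presumably also consider the task type.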
