关于非常大型英语语言模式多语文能力 (On the Multilingual Capabilities of Very Large-Scale English Language Models) - 专知论文

会员服务 ·

0

语言模型化 · MoDELS · 缩放 · 可理解性 · 小样本学习 ·

2021 年 8 月 30 日

On the Multilingual Capabilities of Very Large-Scale English Language Models

翻译：关于非常大型英语语言模式多语文能力

Jordi Armengol-Estapé,Ona de Gibert Bonet,Maite Melero

Generative Pre-trained Transformers (GPTs) have recently been scaled to unprecedented sizes in the history of machine learning. These models, solely trained on the language modeling objective, have been shown to exhibit outstanding few-shot learning capabilities in a number of different tasks. Nevertheless, aside from anecdotal experiences, little is known regarding their multilingual capabilities, given the fact that the pre-training corpus is almost entirely composed of English text. In this work, we investigate the multilingual skills of GPT-3, focusing on one language that barely appears in the pre-training corpus, Catalan, which makes the results especially meaningful; we assume that our results may be relevant for other languages as well. We find that the model shows an outstanding performance, particularly in generative tasks, with predictable limitations mostly in language understanding tasks but still with remarkable results given the zero-shot scenario. We investigate its potential and limits in extractive question-answering and natural language generation, as well as the effect of scale in terms of model size.

翻译：最近,在机器学习史上,经过培训的预产型变异器(GPTs)已发展到前所未有的规模,这些只受过语言建模目标培训的模型显示,在一些不同的任务中,这些模型表现出了杰出的微小学习能力,然而,除了传闻经验外,对于其多语种能力知之甚少,因为培训前的教材几乎完全由英文文本组成。在这项工作中,我们调查了GPT-3的多语种技能,侧重于在培训前教材中几乎没有出现的一种语言,即加泰罗尼亚语,结果特别有意义;我们假设我们的结果可能与其他语言相关。我们发现,该模型显示了一种杰出的成绩,特别是在基因化任务方面,主要在语言理解任务方面可以预见的局限性,但在零光学问答和自然语言生成方面仍然取得显著成果。我们调查了该模型在采掘问答和自然语言生成方面的潜力和局限性,以及规模的影响。

0

相关内容

语言模型化

语言模型化

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

自然语言处理中的注意力机制，Attention in Natural Language Processing

自然语言处理中的注意力机制，Attention in Natural Language Processing

专知会员服务

136+阅读 · 2020年5月30日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

机器翻译深度学习最新综述

机器翻译深度学习最新综述

专知会员服务

99+阅读 · 2020年2月20日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

AINLP

12+阅读 · 2018年11月1日

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

专知

6+阅读 · 2018年1月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Continual Learning in Multilingual NMT via Language-Specific Embeddings

Arxiv

0+阅读 · 2021年10月20日

AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation

Arxiv

0+阅读 · 2021年10月19日

Predicting the Performance of Multilingual NLP Models

Arxiv

0+阅读 · 2021年10月17日

PAGnol: An Extra-Large French Generative Model

Arxiv

0+阅读 · 2021年10月16日

Compression of Deep Learning Models for Text: A Survey

Compression of Deep Learning Models for Text: A Survey

Arxiv

7+阅读 · 2020年8月12日

Extending Machine Language Models toward Human-Level Language Understanding

Extending Machine Language Models toward Human-Level Language Understanding

Arxiv

4+阅读 · 2019年12月12日

Pre-trained Language Model Representations for Language Generation

Arxiv

5+阅读 · 2019年4月1日

Multi-Task Neural Models for Translating Between Styles Within and Across Languages

Arxiv

4+阅读 · 2018年6月12日

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks

Arxiv

3+阅读 · 2018年1月23日

Multilingual Topic Models

Arxiv

3+阅读 · 2017年12月18日

VIP会员

文章信息

相关主题

语言模型化

小样本学习

相关VIP内容

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

自然语言处理中的注意力机制，Attention in Natural Language Processing

自然语言处理中的注意力机制，Attention in Natural Language Processing

专知会员服务

136+阅读 · 2020年5月30日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

机器翻译深度学习最新综述

机器翻译深度学习最新综述

专知会员服务

99+阅读 · 2020年2月20日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

AINLP

12+阅读 · 2018年11月1日

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

专知

6+阅读 · 2018年1月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Continual Learning in Multilingual NMT via Language-Specific Embeddings

Arxiv

0+阅读 · 2021年10月20日

AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation

Arxiv

0+阅读 · 2021年10月19日

Predicting the Performance of Multilingual NLP Models

Arxiv

0+阅读 · 2021年10月17日

PAGnol: An Extra-Large French Generative Model

Arxiv

0+阅读 · 2021年10月16日

Compression of Deep Learning Models for Text: A Survey

Compression of Deep Learning Models for Text: A Survey

Arxiv

7+阅读 · 2020年8月12日

Extending Machine Language Models toward Human-Level Language Understanding

Extending Machine Language Models toward Human-Level Language Understanding

Arxiv

4+阅读 · 2019年12月12日

Pre-trained Language Model Representations for Language Generation

Arxiv

5+阅读 · 2019年4月1日

Multi-Task Neural Models for Translating Between Styles Within and Across Languages

Arxiv

4+阅读 · 2018年6月12日

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks

Arxiv

3+阅读 · 2018年1月23日

Multilingual Topic Models

Arxiv

3+阅读 · 2017年12月18日

微信扫码咨询专知VIP会员