BARThez:有技能的预先训练法国序列到序列模型 (BARThez: a Skilled Pretrained French Sequence-to-Sequence Model) - 专知论文

会员服务 ·

0

BART · MoDELS · 归纳迁移学习 · Continuity · 语言模型化 ·

2021 年 2 月 9 日

BARThez: a Skilled Pretrained French Sequence-to-Sequence Model

翻译：BARThez:有技能的预先训练法国序列到序列模型

Moussa Kamal Eddine,Antoine J. -P. Tixier,Michalis Vazirgiannis

from arxiv, More experiments and results, human evaluation, reorganization of paper

Inductive transfer learning has taken the entire NLP field by storm, with models such as BERT and BART setting new state of the art on countless NLU tasks. However, most of the available models and research have been conducted for English. In this work, we introduce BARThez, the first large-scale pretrained seq2seq model for French. Being based on BART, BARThez is particularly well-suited for generative tasks. We evaluate BARThez on five discriminative tasks from the FLUE benchmark and two generative tasks from a novel summarization dataset, OrangeSum, that we created for this research. We show BARThez to be very competitive with state-of-the-art BERT-based French language models such as CamemBERT and FlauBERT. We also continue the pretraining of a multilingual BART on BARThez' corpus, and show our resulting model, mBARThez, to significantly boost BARThez' generative performance. Code, data and models are publicly available.

翻译：感应转移学习通过风暴将整个NLP领域推向了风暴,模型如BERT和BART为无数NLU任务确立了新的艺术状态。然而,大多数现有模型和研究都是为英语进行的。在这项工作中,我们为法语引入了第一个大型的预先培训的后继2等模型BARThez。基于BART,BARThez特别适合基因化任务。我们评估了五个来自FLUE基准的差别性任务BARThez,以及我们为这项研究创建的新型合成数据集OrangeSum的两件基因化任务。我们显示BARThez与基于最新技术的BERT的法语模型,如CamemBERT和FlauBERT非常有竞争力。我们还继续对BART进行多语种的BART进行关于BART的预培训,并展示我们由此产生的模型MBARthez,以显著提升BARThez的基因化表现。代码、数据和模型是公开的。

0

相关内容

BART

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【大规模数据系统，552页ppt】Large-scale Data Systems

【大规模数据系统，552页ppt】Large-scale Data Systems

专知会员服务

61+阅读 · 2019年12月21日

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

专知会员服务

28+阅读 · 2019年11月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

已删除

将门创投

3+阅读 · 2019年1月15日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

专知

6+阅读 · 2018年1月25日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

Pretrained Transformers Improve Out-of-Distribution Robustness

Arxiv

5+阅读 · 2020年4月13日

Attention Forcing for Sequence-to-sequence Model Training

Attention Forcing for Sequence-to-sequence Model Training

Arxiv

7+阅读 · 2019年9月26日

Text Summarization with Pretrained Encoders

Arxiv

5+阅读 · 2019年8月22日

Learning Deep Transformer Models for Machine Translation

Learning Deep Transformer Models for Machine Translation

Arxiv

3+阅读 · 2019年6月5日

Sparse Sequence-to-Sequence Models

Sparse Sequence-to-Sequence Models

Arxiv

5+阅读 · 2019年5月14日

Classical Structured Prediction Losses for Sequence to Sequence Learning

Arxiv

6+阅读 · 2018年5月24日

Phrase-Based & Neural Unsupervised Machine Translation

Arxiv

4+阅读 · 2018年4月20日

Investigations on Knowledge Base Embedding for Relation Prediction and Extraction

Arxiv

8+阅读 · 2018年2月6日

Word Translation Without Parallel Data

Arxiv

8+阅读 · 2018年1月30日

Fine-tuned Language Models for Text Classification

Arxiv

5+阅读 · 2018年1月18日

VIP会员

文章信息

相关主题

归纳迁移学习

语言模型化

相关VIP内容

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【大规模数据系统，552页ppt】Large-scale Data Systems

【大规模数据系统，552页ppt】Large-scale Data Systems

专知会员服务

61+阅读 · 2019年12月21日

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

专知会员服务

28+阅读 · 2019年11月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

【普林斯顿博士论文】在线学习：优化、控制与学习理论

不确定环境下无人机三维路径规划研究 | 221页

【NeurIPS2025】《LeapFactual：基于条件流匹配的可靠视觉反事实解释》

大语言模型将如何改变军事指挥结构

相关资讯

已删除

将门创投

3+阅读 · 2019年1月15日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

【论文推荐】最新6篇机器翻译相关论文—词性和语义标注任务、变分递归神经机器翻译、文学语料、神经后缀预测、重构模型

专知

6+阅读 · 2018年1月25日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

相关论文

Pretrained Transformers Improve Out-of-Distribution Robustness

Arxiv

5+阅读 · 2020年4月13日

Attention Forcing for Sequence-to-sequence Model Training

Attention Forcing for Sequence-to-sequence Model Training

Arxiv

7+阅读 · 2019年9月26日

Text Summarization with Pretrained Encoders

Arxiv

5+阅读 · 2019年8月22日

Learning Deep Transformer Models for Machine Translation

Learning Deep Transformer Models for Machine Translation

Arxiv

3+阅读 · 2019年6月5日

Sparse Sequence-to-Sequence Models

Sparse Sequence-to-Sequence Models

Arxiv

5+阅读 · 2019年5月14日

Classical Structured Prediction Losses for Sequence to Sequence Learning

Arxiv

6+阅读 · 2018年5月24日

Phrase-Based & Neural Unsupervised Machine Translation

Arxiv

4+阅读 · 2018年4月20日

Investigations on Knowledge Base Embedding for Relation Prediction and Extraction

Arxiv

8+阅读 · 2018年2月6日

Word Translation Without Parallel Data

Arxiv

8+阅读 · 2018年1月30日

Fine-tuned Language Models for Text Classification

Arxiv

5+阅读 · 2018年1月18日

微信扫码咨询专知VIP会员