改进土著语言的神经机器翻译,采用多语言传输学习 (Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning) - 专知论文

会员服务 ·

0

Machine Translation · Performer · 迁移学习 · 学成 · 情景 ·

2022 年 5 月 14 日

Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning

翻译：改进土著语言的神经机器翻译,采用多语言传输学习

Wei-Rui Chen,Muhammad Abdul-Mageed

Machine translation (MT) involving Indigenous languages, including those possibly endangered, is challenging due to lack of sufficient parallel data. We describe an approach exploiting bilingual and multilingual pretrained MT models in a transfer learning setting to translate from Spanish to ten South American Indigenous languages. Our models set new SOTA on five out of the ten language pairs we consider, even doubling performance on one of these five pairs. Unlike previous SOTA that perform data augmentation to enlarge the train sets, we retain the low-resource setting to test the effectiveness of our models under such a constraint. In spite of the rarity of linguistic information available about the Indigenous languages, we offer a number of quantitative and qualitative analyses (e.g., as to morphology, tokenization, and orthography) to contextualize our results.

翻译：由于缺乏足够的平行数据,涉及土著语言(包括可能濒危语言)的机器翻译(MT)具有挑战性。我们描述了一种在从西班牙语到南美10种土著语言的传输学习环境中利用双语和多语言预先培训的MT模式的方法。我们的模型为我们考虑的10对语言中的5对设定了新的SOTA,甚至将这5对语言中的1对的功能翻一番。与以前为扩大火车组而进行数据扩增的SOTA不同的是,我们保留了低资源环境,在这种限制下检验我们模型的有效性。尽管关于土著语言的现有语言信息很少,但我们提供了一些定量和定性分析(例如形态学、象征性化和笔记),以将我们的结果背景化。

0

相关内容

Machine Translation

Machine Translation

机器翻译（Machine Translation）涵盖计算语言学和语言工程的所有分支，包含多语言方面。特色论文涵盖理论，描述或计算方面的任何下列主题:双语和多语语料库的编写和使用，计算机辅助语言教学，非罗马字符集的计算含义，连接主义翻译方法，对比语言学等。官网地址：http://dblp.uni-trier.de/db/journals/mt/

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

硼磷酸盐无铍深紫外非线性光学材料的设计与制备

国家自然科学基金

0+阅读 · 2016年12月31日

对小分子化合物vacuolin-1抗侵袭/迁移的作用机制及其抗肿瘤转移能力的研究

国家自然科学基金

0+阅读 · 2015年12月31日

功能化石墨烯量子点合成与荧光传感

国家自然科学基金

0+阅读 · 2012年12月31日

含高渗透率风电的电力系统多元储能电源优化配置研究

国家自然科学基金

0+阅读 · 2012年12月31日

一株含双降解质粒的红球菌（Rhodococcus sp.）二噁英降解机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

白光LED用硅基氮氧化物荧光材料的制备、性能及发光机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

全氟（多氟）烷基化合物（PFASs）在全球海洋中的分布与长距离传输机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

IRES调控EV71神经毒性的分子机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

含氮化合物中红色长余辉发光材料的探索

国家自然科学基金

0+阅读 · 2011年12月31日

HDACi多酸前导化合物的设计合成和抗肿瘤活性

国家自然科学基金

0+阅读 · 2008年12月31日

A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning

Arxiv

0+阅读 · 2022年7月5日

M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation

Arxiv

0+阅读 · 2022年7月3日

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Arxiv

0+阅读 · 2022年7月1日

DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models

Arxiv

0+阅读 · 2022年7月1日

MMFN: Multi-Modal-Fusion-Net for End-to-End Driving

Arxiv

0+阅读 · 2022年7月1日

On the Impact of Noises in Crowd-Sourced Data for Speech Translation

Arxiv

0+阅读 · 2022年7月1日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog

Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog

Arxiv

14+阅读 · 2020年3月10日

A Survey of Domain Adaptation for Neural Machine Translation

Arxiv

17+阅读 · 2018年6月1日

Matching Networks for One Shot Learning

Arxiv

10+阅读 · 2017年12月29日

VIP会员

文章信息

相关主题

Machine Translation

相关VIP内容

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

自主人工智能：未来战争是否将是自主化的？

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning

Arxiv

0+阅读 · 2022年7月5日

M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation

Arxiv

0+阅读 · 2022年7月3日

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Arxiv

0+阅读 · 2022年7月1日

DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models

Arxiv

0+阅读 · 2022年7月1日

MMFN: Multi-Modal-Fusion-Net for End-to-End Driving

Arxiv

0+阅读 · 2022年7月1日

On the Impact of Noises in Crowd-Sourced Data for Speech Translation

Arxiv

0+阅读 · 2022年7月1日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog

Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog

Arxiv

14+阅读 · 2020年3月10日

A Survey of Domain Adaptation for Neural Machine Translation

Arxiv

17+阅读 · 2018年6月1日

Matching Networks for One Shot Learning

Arxiv

10+阅读 · 2017年12月29日

相关基金

硼磷酸盐无铍深紫外非线性光学材料的设计与制备

国家自然科学基金

0+阅读 · 2016年12月31日

对小分子化合物vacuolin-1抗侵袭/迁移的作用机制及其抗肿瘤转移能力的研究

国家自然科学基金

0+阅读 · 2015年12月31日

功能化石墨烯量子点合成与荧光传感

国家自然科学基金

0+阅读 · 2012年12月31日

含高渗透率风电的电力系统多元储能电源优化配置研究

国家自然科学基金

0+阅读 · 2012年12月31日

一株含双降解质粒的红球菌（Rhodococcus sp.）二噁英降解机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

白光LED用硅基氮氧化物荧光材料的制备、性能及发光机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

全氟（多氟）烷基化合物（PFASs）在全球海洋中的分布与长距离传输机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

IRES调控EV71神经毒性的分子机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

含氮化合物中红色长余辉发光材料的探索

国家自然科学基金

0+阅读 · 2011年12月31日

HDACi多酸前导化合物的设计合成和抗肿瘤活性

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员