While Active Learning (AL) techniques have been explored in Neural Machine Translation (NMT), only a few works focus on tackling low annotation budgets, where only a limited number of sentences can be translated. Such situations are especially challenging and can occur for endangered languages with few human annotators, or under cost constraints that prevent labelling large amounts of data. Although AL has been shown to be helpful with large budgets, it is not enough to build high-quality translation systems in these low-resource conditions. In this work, we propose a cost-effective training procedure to increase the performance of NMT models, utilizing a small number of annotated sentences and dictionary entries. Our method leverages monolingual data with self-supervised objectives and a small-scale, inexpensive dictionary for additional supervision to initialize the NMT model before applying AL. We show that improving the model using a combination of these knowledge sources is essential to exploit AL strategies and increase gains in low-resource conditions. We also present a novel AL strategy inspired by domain adaptation for NMT and show that it is effective for low budgets. It is a hybrid data-driven approach, which samples sentences that are diverse from the labelled data while also being most similar to the unlabelled data. Finally, we show that initializing the NMT model and further using our AL strategy can achieve gains of up to $13$ BLEU compared to conventional AL methods.
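The hybrid data-driven strategy is only described at a high level in the abstract; the following is a minimal sketch of one plausible realization, assuming precomputed sentence embeddings for the candidate, labelled, and unlabelled sets. The function name `hybrid_select` and the mixing weight `alpha` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def hybrid_select(cand_emb, labeled_emb, unlabeled_emb, k, alpha=0.5):
    """Pick k candidates that are diverse from the labelled data
    and similar to the unlabelled pool (hypothetical sketch).

    All *_emb arguments are 2-D arrays of sentence embeddings,
    one row per sentence.
    """
    # Normalize embeddings so dot products become cosine similarities.
    cand = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    lab = labeled_emb / np.linalg.norm(labeled_emb, axis=1, keepdims=True)

    # Similarity to the unlabelled pool: cosine similarity to its centroid.
    centroid = unlabeled_emb.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    sim_unlabeled = cand @ centroid

    # Diversity from the labelled data: one minus the cosine similarity
    # to the nearest labelled sentence.
    div_labeled = 1.0 - (cand @ lab.T).max(axis=1)

    # Combine both criteria; alpha is an assumed trade-off weight.
    score = alpha * sim_unlabeled + (1.0 - alpha) * div_labeled
    return np.argsort(-score)[:k]
```

A design note on this sketch: scoring against the unlabelled-pool centroid keeps the selection representative of the data the model will face, while the nearest-labelled-neighbour term discourages redundant annotation; other diversity measures (e.g. clustering-based coverage) could be substituted under the same scoring scheme.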