In this work, we study how the generalization performance of a given translation direction changes with its sampling ratio in Multilingual Neural Machine Translation (MNMT). By training over 200 multilingual models with various model sizes, directions, and total numbers of tasks, we find that scalarization leads to a multitask trade-off front that deviates from the traditional Pareto front when the training corpus is data-imbalanced. That is, the performance of certain translation directions does not improve as their weight in the multi-task optimization objective increases, which poses a greater challenge to improving the overall performance of all directions. Based on these observations, we propose the Double Power Law to predict the unique performance trade-off front in MNMT, which is robust across various languages, data adequacies, and numbers of tasks. Finally, we formulate sampling ratio selection in MNMT as an optimization problem based on the Double Power Law; in our experiments, it achieves better performance than temperature searching and gradient manipulation methods while using up to half of the total training budget.
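To make the optimization-based sampling ratio selection concrete, the sketch below fits a per-direction power-law performance model to a few (sampling ratio, loss) pilot measurements and then chooses ratios that minimize the average predicted loss over the probability simplex. This is a minimal illustration only: the single-term form `L(p) = c * p**(-alpha) + b` is an assumed stand-in for the Double Power Law (whose exact functional form is not given in this abstract), and all names (`fit_direction`, `select_ratios`, the toy pilot data) are hypothetical, not the paper's implementation.

```python
# Hedged sketch: pick per-direction sampling ratios by minimizing the
# average loss predicted by fitted power-law curves. The functional form
# below is an assumption standing in for the paper's Double Power Law.
import numpy as np
from scipy.optimize import curve_fit, minimize


def power_law(p, c, alpha, b):
    # Predicted validation loss of one direction at sampling ratio p.
    return c * np.power(p, -alpha) + b


def fit_direction(ratios, losses):
    # Fit (c, alpha, b) from a handful of (ratio, loss) pilot runs.
    params, _ = curve_fit(power_law, ratios, losses,
                          p0=(1.0, 0.5, 0.0), maxfev=10_000)
    return params


def select_ratios(fitted_params):
    # Minimize the mean predicted loss subject to ratios summing to 1.
    k = len(fitted_params)

    def objective(p):
        return np.mean([power_law(p[i], *fitted_params[i]) for i in range(k)])

    constraints = ({"type": "eq", "fun": lambda p: p.sum() - 1.0},)
    bounds = [(1e-3, 1.0)] * k
    result = minimize(objective, x0=np.full(k, 1.0 / k),
                      bounds=bounds, constraints=constraints)
    return result.x


# Toy usage: two directions, each measured at three pilot sampling ratios.
pilot = {
    "en-de": ([0.2, 0.5, 0.8], [3.1, 2.6, 2.4]),
    "en-fr": ([0.2, 0.5, 0.8], [2.8, 2.5, 2.4]),
}
params = [fit_direction(np.array(r), np.array(l)) for r, l in pilot.values()]
print(select_ratios(params))  # ratios on the simplex, one per direction
```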