Dynamic Multi-Branch Layers for On-Device Neural Machine Translation

With the rapid development of artificial intelligence (AI), there is a trend toward moving AI applications such as neural machine translation (NMT) from the cloud to mobile devices such as smartphones. Because on-device NMT systems are constrained by limited hardware resources and battery life, their performance is far from satisfactory. Inspired by conditional computation, we propose to improve the performance of on-device NMT systems with dynamic multi-branch layers. Specifically, we design a layer-wise dynamic multi-branch network in which only one branch is activated during training and inference. As not all branches are activated during training, we propose shared-private reparameterization to ensure sufficient training for each branch. At almost the same computational cost, our method achieves improvements of up to 1.7 BLEU points on the WMT14 English-German translation task and 1.8 BLEU points on the WMT20 Chinese-English translation task over the Transformer model. Compared with a strong baseline that also uses multiple branches, the proposed method is up to 1.6 times faster with the same number of parameters.
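The abstract does not spell out how a branch is chosen or how the reparameterization is formed, so the following is only a minimal sketch under stated assumptions: a linear gate picks the single highest-scoring branch for each input, and each branch's effective weight is the sum of one shared matrix and one branch-specific private matrix. The class name `DynamicMultiBranchLinear` and all of its methods are hypothetical and are not taken from the authors' code.

```python
import random


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


class DynamicMultiBranchLinear:
    """Hypothetical linear layer with several branches, one active per input.

    Assumptions (not stated in the abstract):
      * the gate is a linear scorer followed by argmax;
      * branch k uses the weight shared + private[k].
    """

    def __init__(self, d_in, d_out, n_branches, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols, scale=0.1):
            return [[rng.uniform(-scale, scale) for _ in range(cols)]
                    for _ in range(rows)]

        # Shared part: would receive gradient from every training example.
        self.shared = rand_matrix(d_out, d_in)
        # Private parts: each would be updated only when its branch is chosen.
        self.private = [rand_matrix(d_out, d_in) for _ in range(n_branches)]
        # Gate: one score per branch.
        self.gate = rand_matrix(n_branches, d_in)

    def select_branch(self, x):
        """Return the index of the highest-scoring branch for input x."""
        scores = matvec(self.gate, x)
        return max(range(len(scores)), key=scores.__getitem__)

    def branch_weight(self, k):
        """Effective weight of branch k: shared plus private[k]."""
        return [[s + p for s, p in zip(s_row, p_row)]
                for s_row, p_row in zip(self.shared, self.private[k])]

    def forward(self, x):
        """Run only the selected branch; return (branch index, output)."""
        k = self.select_branch(x)
        return k, matvec(self.branch_weight(k), x)
```

Under these assumptions, only one branch's matrix product is computed per input, so the per-input cost stays close to that of a single-branch layer apart from the gate. The summed weight of each branch could also be precomputed once after training, so the shared part adds no extra work at inference time.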


