Much recent effort has been invested in non-autoregressive neural machine translation, which appears to be an efficient alternative to state-of-the-art autoregressive machine translation on modern GPUs. In contrast to the latter, where generation is sequential, the former allows generation to be parallelized across target token positions. Some of the latest non-autoregressive models have achieved impressive translation quality-speed tradeoffs compared to autoregressive baselines. In this work, we reexamine this tradeoff and argue that autoregressive baselines can be substantially sped up without loss in accuracy. Specifically, we study autoregressive models with encoders and decoders of varied depths. Our extensive experiments show that given a sufficiently deep encoder, a single-layer autoregressive decoder can substantially outperform strong non-autoregressive models with comparable inference speed. We show that the speed disadvantage for autoregressive baselines compared to non-autoregressive methods has been overestimated in three aspects: suboptimal layer allocation, insufficient speed measurement, and lack of knowledge distillation. Our results establish a new protocol for future research toward fast, accurate machine translation. Our code is available at https://github.com/jungokasai/deep-shallow.
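To make the layer-allocation idea concrete, here is a minimal sketch using PyTorch's torch.nn.Transformer. It is illustrative only, not the authors' released implementation (see the repository above); the 12-layer encoder depth, model dimension, and other hyperparameters are assumptions chosen for demonstration.

```python
# Illustrative sketch (assumed configuration, not the authors' released code)
# of a deep-encoder, shallow-decoder Transformer: a deep encoder paired with
# a single-layer autoregressive decoder.
import torch
import torch.nn as nn

model = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=12,  # deep encoder: runs once per source sentence
    num_decoder_layers=1,   # single-layer decoder: the per-step cost of autoregressive decoding
    dim_feedforward=2048,
)

# PyTorch's default layout is (sequence length, batch size, d_model).
src = torch.rand(24, 16, 512)  # encoded in parallel over source positions
tgt = torch.rand(30, 16, 512)  # generated left-to-right at inference time
causal_mask = model.generate_square_subsequent_mask(tgt.size(0))
out = model(src, tgt, tgt_mask=causal_mask)
print(out.shape)  # torch.Size([30, 16, 512])
```

At inference time the decoder still generates token by token, but under this allocation the sequential per-step cost is confined to a single decoder layer, while the deep encoder is applied only once per source sentence and in parallel across source positions.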