Recent work in multilingual translation has advanced translation quality beyond bilingual baselines by using deep transformer models with increased capacity. However, the extra latency and memory costs introduced by this approach may make it unacceptable for efficiency-constrained applications. It has recently been shown for bilingual translation that a deep encoder with a shallow decoder (DESD) can reduce inference latency while maintaining translation quality, so we study similar speed-accuracy trade-offs for multilingual translation. We find that for many-to-one translation this approach does let us increase decoding speed without sacrificing quality, but for one-to-many translation, shallow decoders cause a clear quality drop. To mitigate this drop, we propose a deep encoder with multiple shallow decoders (DEMSD), where each shallow decoder is responsible for a disjoint subset of target languages. Specifically, a DEMSD model with 2-layer decoders obtains an average 1.8x speedup over a standard transformer model with no drop in translation quality.
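The routing idea behind DEMSD can be illustrated with a minimal Python sketch. This is not the paper's implementation; all names (`make_decoder_map`, `translate`, the toy encoder/decoder stand-ins, and the language groups) are hypothetical. It only shows the core constraint: one shared deep encoder, and each target language assigned to exactly one shallow decoder from a disjoint partition.

```python
# Hypothetical sketch of DEMSD-style routing: a shared deep encoder and
# several shallow decoders, each owning a disjoint subset of target languages.

def make_decoder_map(language_groups):
    """Map each target language to the index of its dedicated shallow decoder.

    language_groups: list of language-code lists; the subsets must be disjoint.
    """
    mapping = {}
    for idx, group in enumerate(language_groups):
        for lang in group:
            assert lang not in mapping, f"{lang} assigned to two decoders"
            mapping[lang] = idx
    return mapping

def translate(src_tokens, tgt_lang, encoder, decoders, decoder_map):
    """Encode once with the deep encoder, then decode with the shallow
    decoder responsible for tgt_lang."""
    hidden = encoder(src_tokens)
    return decoders[decoder_map[tgt_lang]](hidden, tgt_lang)

# Toy stand-ins for the encoder and the shallow decoders (illustrative only).
encoder = lambda toks: f"enc({toks})"
decoders = [
    lambda h, lang: f"dec0({h},{lang})",
    lambda h, lang: f"dec1({h},{lang})",
]
decoder_map = make_decoder_map([["de", "fr"], ["ja", "zh"]])
```

Because each decoder serves only a subset of languages, it can stay shallow (e.g. 2 layers) while the shared encoder carries most of the model capacity, which is where the reported latency savings come from.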