Lossless Speedup of Autoregressive Translation with Generalized Aggressive Decoding - Zhuanzhi (专知) Papers

May 20, 2022

Heming Xia, Tao Ge, Furu Wei, Zhifang Sui

From arXiv. Work in progress.

Unlike previous work that accelerates translation at the cost of quality, we propose Generalized Aggressive Decoding (GAD), a novel decoding paradigm for lossless speedup of autoregressive translation through the collaboration of autoregressive translation and non-autoregressive translation (NAT) with the Transformer. At each decoding iteration, GAD aggressively decodes a number of tokens with NAT as a draft and then verifies them in an autoregressive manner; only the tokens that pass verification are accepted as decoded tokens. GAD can achieve the same results as autoregressive translation but much more efficiently, because both NAT drafting and autoregressive verification compute in parallel. We conduct experiments on four standard WMT benchmarks and confirm that vanilla GAD yields exactly the same results as greedy decoding with around a $3\times$ speedup. Its variant with an advanced verification strategy (GAD++) not only outperforms greedy translation, achieving translation quality comparable to beam search, but also further improves decoding speed, resulting in around a $5\times$ speedup over autoregressive translation. Moreover, GAD can be easily generalized for lossless speedup of other seq2seq tasks such as abstractive summarization, and it benefits more from stronger computing devices, demonstrating its potential to become a de facto decoding paradigm in the future. Our models and code are available at https://github.com/hemingkx/GAD.
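The draft-then-verify loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that): the names `greedy_decode`, `draft_and_verify_decode`, `toy_ar_next`, `toy_draft`, `TARGET`, and `EOS` are hypothetical, both "models" are plain Python functions rather than Transformers, and verification is written as a sequential loop, whereas a real autoregressive Transformer would score all drafted positions in one parallel pass. The sketch also assumes that, at the first drafted token that fails verification, the verifier's own token is kept and the rest of the draft is discarded.

```python
from typing import Callable, List, Tuple

EOS = 0  # hypothetical end-of-sequence token id

NextToken = Callable[[List[int]], int]            # verifier: prefix -> greedy next token
Drafter = Callable[[List[int], int], List[int]]   # drafter: (prefix, k) -> k guessed tokens


def greedy_decode(ar_next: NextToken, max_len: int = 32) -> List[int]:
    """Baseline: one verifier call per generated token."""
    out: List[int] = []
    while len(out) < max_len:
        tok = ar_next(out)
        out.append(tok)
        if tok == EOS:
            break
    return out


def draft_and_verify_decode(
    ar_next: NextToken, draft: Drafter, block_size: int = 4, max_len: int = 32
) -> Tuple[List[int], int]:
    """Draft a block of tokens, then keep only the verified prefix.

    Returns the decoded tokens and the number of draft/verify iterations.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    out: List[int] = []
    iterations = 0
    while len(out) < max_len and (not out or out[-1] != EOS):
        iterations += 1
        drafted = draft(out, block_size)
        for guess in drafted:
            # Sequential stand-in for the parallel verification pass.
            expected = ar_next(out)
            # On a match this accepts the drafted token; on a mismatch it
            # keeps the verifier's token instead (assumption of this sketch).
            out.append(expected)
            if expected != guess or expected == EOS or len(out) >= max_len:
                break  # discard the remainder of the draft
    return out, iterations


# Toy models: the verifier always emits TARGET; the drafter agrees with it
# except at every position p with p % 4 == 3, where it guesses a wrong token.
TARGET = [5, 3, 8, 2, 7, 4, 9, 1, 6, EOS]


def _target_at(pos: int) -> int:
    return TARGET[pos] if pos < len(TARGET) else EOS


def toy_ar_next(prefix: List[int]) -> int:
    return _target_at(len(prefix))


def toy_draft(prefix: List[int], k: int) -> List[int]:
    start = len(prefix)
    return [99 if pos % 4 == 3 else _target_at(pos) for pos in range(start, start + k)]


if __name__ == "__main__":
    tokens, iterations = draft_and_verify_decode(toy_ar_next, toy_draft, block_size=4)
    print(tokens == greedy_decode(toy_ar_next), iterations)
```

Because every emitted token is the verifier's own greedy choice, the output matches `greedy_decode` regardless of how good the drafter is; the drafter only affects how many iterations are needed. In this toy setup the loop finishes in fewer iterations than there are output tokens, which is where a speedup would come from if each iteration costs roughly one parallel forward pass.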
