配有平行自动递后调整的不偏向端对端发言翻译 (Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring) - 专知论文

会员服务 ·

0

语音翻译 · 解码 · CMLM · Conformer · MoDELS ·

2021 年 9 月 9 日

Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring

翻译：配有平行自动递后调整的不偏向端对端发言翻译

Hirofumi Inaguma,Yosuke Higuchi,Kevin Duh,Tatsuya Kawahara,Shinji Watanabe

This article describes an efficient end-to-end speech translation (E2E-ST) framework based on non-autoregressive (NAR) models. End-to-end speech translation models have several advantages over traditional cascade systems such as inference latency reduction. However, conventional AR decoding methods are not fast enough because each token is generated incrementally. NAR models, however, can accelerate the decoding speed by generating multiple tokens in parallel on the basis of the token-wise conditional independence assumption. We propose a unified NAR E2E-ST framework called Orthros, which has an NAR decoder and an auxiliary shallow AR decoder on top of the shared encoder. The auxiliary shallow AR decoder selects the best hypothesis by rescoring multiple candidates generated from the NAR decoder in parallel (parallel AR rescoring). We adopt conditional masked language model (CMLM) and a connectionist temporal classification (CTC)-based model as NAR decoders for Orthros, referred to as Orthros-CMLM and Orthros-CTC, respectively. We also propose two training methods to enhance the CMLM decoder. Experimental evaluations on three benchmark datasets with six language directions demonstrated that Orthros achieved large improvements in translation quality with a very small overhead compared with the baseline NAR model. Moreover, the Conformer encoder architecture enabled large quality improvements, especially for CTC-based models. Orthros-CTC with the Conformer encoder increased decoding speed by 3.63x on CPU with translation quality comparable to that of an AR model.

翻译：本篇文章描述基于非自动递增模式的高效端到端语音翻译(E2E-ST)框架。端到端语音翻译模型比传统的级联系统( 如导引延延缩缩放) 有几个优势。但是, 常规AR解码方法不够快, 因为每个符号都是递增生成的。但是, NAR 模型可以在象征性有条件独立假设的基础上同时生成多个符号, 从而加速解码速度。我们提议一个统一的NAR E2E- ST框架, 称为Orthros, 在共享编码器的顶端有一个NAR解码器和一个辅助浅度的AR解码器。辅助浅度 AR 解码器选择了最好的假设, 重新校验同时生成的NAR解码器产生的多个候选人( parllel AR 重校验) 。我们采用有条件的遮码语言模型( CMM ) 和基于连接时间的模型( Cros) 以NAR deco 基础模型为基础, 称为Orth- LM 和 Orros 快速解译的升级模型, 和 Cral 分别提出C- cal 的升级的升级。我们还提议用大规模的两种方法, 在大规模的升级的升级的模型上,, 和大规模的升级的升级的解算,,,, 和大规模的解译程的解码的解码,, 的解码的解码的解码结构将提升了C- 。

0

相关内容

语音翻译

通过计算机进行不同语言之间的直接语音翻译，辅助不同语言背景的人们进行沟通已经成为世界各国研究的重点。和一般的文本翻译不同，语音翻译需要把语音识别、机器翻译和语音合成三大技术进行集成，具有很大的挑战性。

【UBC】高级机器学习课程，Advanced Machine Learning

【UBC】高级机器学习课程，Advanced Machine Learning

专知会员服务

26+阅读 · 2021年1月26日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

123+阅读 · 2020年5月30日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

专知会员服务

37+阅读 · 2020年2月27日

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

专知会员服务

11+阅读 · 2019年12月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知

21+阅读 · 2020年5月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Jointly Improving Summarization and Sentiment Classification

Jointly Improving Summarization and Sentiment Classification

黑龙江大学自然语言处理实验室

3+阅读 · 2018年6月12日

机器翻译 | Bleu：此蓝;非彼蓝

机器翻译 | Bleu：此蓝;非彼蓝

黑龙江大学自然语言处理实验室

4+阅读 · 2018年3月14日

【泡泡一分钟】一种基于光场的快速有效深度图估计方法（3dv-43）

【泡泡一分钟】一种基于光场的快速有效深度图估计方法（3dv-43）

泡泡机器人SLAM

4+阅读 · 2018年2月11日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

自然语言处理（二）机器翻译篇 (NLP: machine translation)

自然语言处理（二）机器翻译篇 (NLP: machine translation)

DeepLearning中文论坛

12+阅读 · 2015年7月1日

Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition

Arxiv

0+阅读 · 2021年10月29日

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Arxiv

0+阅读 · 2021年10月27日

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

Arxiv

3+阅读 · 2020年6月9日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Language Modeling with Deep Transformers

Arxiv

6+阅读 · 2019年7月11日

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

Arxiv

5+阅读 · 2018年6月4日

Bi-Directional Neural Machine Translation with Synthetic Parallel Data

Arxiv

6+阅读 · 2018年5月29日

Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement

Arxiv

4+阅读 · 2018年2月19日

Word Translation Without Parallel Data

Arxiv

8+阅读 · 2018年1月30日

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Arxiv

7+阅读 · 2018年1月18日

VIP会员

文章信息

相关主题

相关VIP内容

【UBC】高级机器学习课程，Advanced Machine Learning

【UBC】高级机器学习课程，Advanced Machine Learning

专知会员服务

26+阅读 · 2021年1月26日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

123+阅读 · 2020年5月30日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

专知会员服务

37+阅读 · 2020年2月27日

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

专知会员服务

11+阅读 · 2019年12月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

大型语言模型遇上文本属性图：一种融合框架与应用的综述

人工智能赋能自主武器与人类控制第三部分：人类控制与系统操作员 | 35页

【博士论文】用于概率程序与生成模型的变分推断

军事指挥控制系统：2025年5种用途

相关资讯

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知

21+阅读 · 2020年5月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Jointly Improving Summarization and Sentiment Classification

Jointly Improving Summarization and Sentiment Classification

黑龙江大学自然语言处理实验室

3+阅读 · 2018年6月12日

机器翻译 | Bleu：此蓝;非彼蓝

机器翻译 | Bleu：此蓝;非彼蓝

黑龙江大学自然语言处理实验室

4+阅读 · 2018年3月14日

【泡泡一分钟】一种基于光场的快速有效深度图估计方法（3dv-43）

【泡泡一分钟】一种基于光场的快速有效深度图估计方法（3dv-43）

泡泡机器人SLAM

4+阅读 · 2018年2月11日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

自然语言处理（二）机器翻译篇 (NLP: machine translation)

自然语言处理（二）机器翻译篇 (NLP: machine translation)

DeepLearning中文论坛

12+阅读 · 2015年7月1日

相关论文

Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition

Arxiv

0+阅读 · 2021年10月29日

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Arxiv

0+阅读 · 2021年10月27日

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

Arxiv

3+阅读 · 2020年6月9日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Language Modeling with Deep Transformers

Arxiv

6+阅读 · 2019年7月11日

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

Arxiv

5+阅读 · 2018年6月4日

Bi-Directional Neural Machine Translation with Synthetic Parallel Data

Arxiv

6+阅读 · 2018年5月29日

Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement

Arxiv

4+阅读 · 2018年2月19日

Word Translation Without Parallel Data

Arxiv

8+阅读 · 2018年1月30日

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Arxiv

7+阅读 · 2018年1月18日

微信扫码咨询专知VIP会员