4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders

End-to-end (E2E) automatic speech recognition (ASR) network architectures fall into several families, including connectionist temporal classification (CTC), recurrent neural network transducer (RNN-T), attention-based models, and non-autoregressive mask-predict models. Since each of these architectures has pros and cons, a typical approach is to switch between separate models depending on the application requirement, which increases the overhead of maintaining all of them. Several methods for integrating two of these complementary models to mitigate the overhead issue have been proposed; however, integrating more models would yield further benefit from their complementary properties and enable broader applications with a single system. This paper proposes four-decoder joint modeling (4D) of CTC, attention, RNN-T, and mask-predict, which has the following three advantages: 1) The four decoders are jointly trained so that they can be easily switched depending on the application scenario. 2) Joint training may regularize the model and improve its robustness thanks to the decoders' complementary properties. 3) Novel one-pass joint decoding methods using CTC, attention, and RNN-T further improve performance. The experimental results showed that the proposed model consistently reduced the word error rate (WER).
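The abstract does not say how the four decoder objectives are combined or how hypotheses are scored during joint decoding. The following is a minimal sketch, assuming a weighted sum of per-decoder losses for training and a weighted sum of per-decoder log-probabilities for scoring a hypothesis. The names `JointWeights`, `joint_loss`, and `joint_score`, and all weight values, are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class JointWeights:
    """Hypothetical interpolation weights (assumed, not from the paper)."""
    ctc: float = 0.25
    attention: float = 0.25
    rnnt: float = 0.25
    mask_predict: float = 0.25


def joint_loss(losses: Mapping[str, float], weights: JointWeights) -> float:
    """Weighted sum of the four per-decoder losses (assumed combination rule)."""
    return (
        weights.ctc * losses["ctc"]
        + weights.attention * losses["attention"]
        + weights.rnnt * losses["rnnt"]
        + weights.mask_predict * losses["mask_predict"]
    )


def joint_score(log_probs: Mapping[str, float],
                weights: Mapping[str, float]) -> float:
    """Weighted sum of per-decoder log-probabilities for one hypothesis.

    Only the decoders named in `weights` contribute, so the same function
    covers a CTC + attention + RNN-T combination or any subset of it.
    """
    return sum(w * log_probs[name] for name, w in weights.items())


# Example with made-up loss values for a single training batch.
example_loss = joint_loss(
    {"ctc": 4.0, "attention": 2.0, "rnnt": 6.0, "mask_predict": 8.0},
    JointWeights(),
)

# Example with made-up log-probabilities for a single hypothesis.
example_score = joint_score(
    {"ctc": -2.0, "attention": -1.0, "rnnt": -4.0},
    {"ctc": 0.5, "attention": 0.25, "rnnt": 0.25},
)
```

In this sketch, switching decoders for a given application amounts to choosing which entries appear in the scoring weights, while the training loss always includes all four terms.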