Two-Pass End-to-End ASR Model Compression

Speech recognition on smart devices is challenging because these devices have limited memory, so small ASR models are desirable. Popular transducer-based models have made it practical to deploy streaming speech recognition on small devices [1]. Recently, the two-pass model [2], which combines RNN-T and LAS modules, has shown exceptional performance for streaming on-device speech recognition. In this work, we propose a simple and effective approach to reduce the size of the two-pass model for memory-constrained devices. We apply knowledge distillation in three stages using the Teacher-Student training technique. In the first stage, a trained RNN-T model serves as the teacher, and knowledge distillation is used to train the student RNN-T model. In the second stage, the student's encoder is shared and a LAS rescorer is trained for the student model using the trained RNN-T+LAS teacher model. Finally, we perform deep-finetuning of the full student model, comprising the shared RNN-T encoder, the RNN-T decoder, and the LAS rescorer. Our experimental results on the standard LibriSpeech dataset show that our system achieves a high compression rate of 55% without significant degradation in WER compared to the two-pass teacher model.
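The abstract does not state the exact distillation objective, so the following is only a minimal sketch of a generic Teacher-Student loss of the kind each stage could use. It assumes a KL divergence between the teacher's and the student's output distributions, mixed with the ordinary task loss. The function names, the `temperature` parameter, and the `alpha` mixing weight are all illustrative assumptions, not details taken from the paper. In an RNN-T or LAS setting the same per-distribution term would be summed over every output position.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_sum for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for one output distribution.

    Assumption: the paper's actual objective may differ; this is the
    common textbook form of knowledge distillation.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def combined_loss(task_loss, kd_loss, alpha=0.5):
    """Blend the student's own task loss with the distillation term.

    `alpha` is a hypothetical mixing weight, not a value from the paper.
    """
    return (1.0 - alpha) * task_loss + alpha * kd_loss


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.5, 0.7, -0.5]
    kd = distillation_loss(teacher, student, temperature=2.0)
    print("distillation term:", kd)
    print("combined loss:", combined_loss(task_loss=1.2, kd_loss=kd))
```

The distillation term is zero when the student matches the teacher exactly and grows as the two distributions diverge, which is what pulls the smaller student toward the teacher's behaviour.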

