Neural network pruning can be effectively applied to compress automatic speech recognition (ASR) models. However, in multilingual ASR, language-agnostic pruning may lead to severe performance degradation on some languages, because a language-agnostic pruning mask may not fit all languages and can discard important language-specific parameters. In this work, we present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways"), so that the parameters for each language are learned explicitly. With the overlapping sub-networks, the shared parameters can also enable knowledge transfer to lower-resource languages via joint multilingual training. We propose a novel algorithm to learn ASR pathways, and evaluate the proposed method on four languages with a streaming RNN-T model. Our proposed ASR pathways outperform both dense models (-5.0% average WER) and a language-agnostically pruned model (-21.4% average WER), and provide better performance on low-resource languages than monolingual sparse models.
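To make the pathway mechanism concrete, below is a minimal PyTorch sketch of the idea, not the paper's implementation: each language indexes a binary mask over a shared weight matrix, so a forward pass activates only that language's sub-network, while overlapping mask entries remain shared across languages. The class name `PathwayLinear`, the `lang_masks` dictionary, and the random mask initialization are illustrative assumptions; in the paper, the masks are learned via pruning during joint multilingual training rather than fixed at random.

```python
import torch
import torch.nn as nn

class PathwayLinear(nn.Module):
    """Sketch of a linear layer with per-language binary pathway masks.

    Hypothetical illustration of the pathway idea: one shared weight
    matrix, one fixed binary mask per language. Overlapping mask entries
    are shared parameters that can transfer knowledge across languages.
    """

    def __init__(self, in_dim: int, out_dim: int, languages: list, sparsity: float = 0.7):
        super().__init__()
        # Shared dense parameters, trained jointly across all languages.
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim) * 0.02)
        # One binary mask per language (random here for illustration;
        # in practice these would be learned via pruning).
        self.lang_masks = {
            lang: (torch.rand(out_dim, in_dim) > sparsity).float()
            for lang in languages
        }

    def forward(self, x: torch.Tensor, lang: str) -> torch.Tensor:
        # Activate only the sub-network ("pathway") for this language.
        masked_weight = self.weight * self.lang_masks[lang]
        return nn.functional.linear(x, masked_weight)

# Usage: the same shared parameters serve all languages, but each
# forward pass prunes the layer to a language-specific pathway.
layer = PathwayLinear(16, 8, languages=["en", "fr", "de", "es"])
x = torch.randn(2, 16)
y_en = layer(x, "en")
y_fr = layer(x, "fr")
```

Because gradients flow only through the unmasked entries for the active language, language-specific weights are updated only by that language's data, while weights in the mask overlap receive updates from every language that shares them.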