Code-switching is the alternation between languages within a single communication. Training end-to-end (E2E) automatic speech recognition (ASR) systems for code-switching is especially challenging because code-switching training data are always insufficient to counter the increased multilingual context confusion that arises when more than one language is present. Based on the Equivalence Constraint (EC) Theory, we propose a language-related attention mechanism that reduces multilingual context confusion in the E2E code-switching ASR model. This linguistic theory requires that any monolingual fragment occurring in a code-switching sentence must also occur in one of the monolingual sentences. The theory thus establishes a bridge between monolingual data and code-switching data, and we leverage it to design the code-switching E2E ASR model. The proposed model efficiently transfers language knowledge from rich monolingual data to improve the performance of the code-switching ASR model. We evaluate our model on the ASRU 2019 Mandarin-English code-switching challenge dataset. Compared to the baseline model, our proposed model achieves a 17.12% relative error reduction.
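The fragment constraint stated above can be illustrated with a toy check. The following is a minimal sketch of a literal reading of that constraint, not the proposed attention mechanism. It assumes sentences are already tokenized, that a token is Mandarin if it contains a CJK character and English otherwise, and that each fragment is looked up in monolingual sentences of its own language. All function names and the example data are hypothetical.

```python
import re


def language_of(token):
    """Label a token 'zh' if it contains a CJK character, else 'en' (assumption)."""
    return "zh" if re.search(r"[\u4e00-\u9fff]", token) else "en"


def monolingual_fragments(tokens):
    """Split a token list into maximal runs of same-language tokens."""
    fragments = []
    for token in tokens:
        lang = language_of(token)
        if fragments and fragments[-1][0] == lang:
            fragments[-1][1].append(token)
        else:
            fragments.append((lang, [token]))
    return fragments


def occurs_in(fragment, sentence):
    """True if `fragment` appears as a contiguous subsequence of `sentence`."""
    n = len(fragment)
    return any(sentence[i:i + n] == fragment
               for i in range(len(sentence) - n + 1))


def satisfies_constraint(cs_tokens, mono_corpus):
    """Check that every monolingual fragment of a code-switching sentence
    occurs in some monolingual sentence of the same language."""
    return all(
        any(occurs_in(fragment, sentence) for sentence in mono_corpus[lang])
        for lang, fragment in monolingual_fragments(cs_tokens)
    )


# Hypothetical toy data, invented for illustration only.
mono_corpus = {
    "zh": [["我", "想", "看", "电影"]],
    "en": [["let", "us", "watch", "a", "movie"]],
}
accepted = satisfies_constraint(["我", "想", "watch", "a", "movie"], mono_corpus)
rejected = satisfies_constraint(["我", "想", "watch", "the", "movie"], mono_corpus)
```

Under these assumptions, `accepted` is true because both fragments appear in the monolingual sentences, and `rejected` is false because the English fragment "watch the movie" does not. The sketch only shows why monolingual data can inform a code-switching model: each fragment of a code-switching sentence has a monolingual counterpart.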