Code-switching is the alternation between two or more languages within a single conversation or utterance. Training end-to-end (E2E) automatic speech recognition (ASR) systems for code-switching is known to be challenging because training data are scarce and because the presence of more than one language increases language context confusion. In this paper, we propose a language-related attention mechanism, based on the Equivalence Constraint (EC) Theory, to reduce multilingual context confusion in the E2E code-switching ASR model. This linguistic theory requires that any monolingual fragment that occurs in a code-switching sentence must also occur in one of the monolingual sentences, which establishes a bridge between monolingual data and code-switching data. By computing attention separately for each language, our method can efficiently transfer language knowledge from rich monolingual data. We evaluate our method on the ASRU 2019 Mandarin-English code-switching challenge dataset. Compared with the baseline model, the proposed method achieves an 11.37% relative mix error rate reduction.
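To make the idea of computing attention separately for each language more concrete, the following is a minimal sketch and not the model described in this paper. It assumes that every context token carries a language tag, that attention for a given language is plain scaled dot-product attention restricted to the tokens of that language, and that the per-language context vectors are combined by summation. The function names, the tags "zh" and "en", and the summation step are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def language_attention(query, keys, values, key_langs, lang):
    """Scaled dot-product attention over only the keys tagged `lang`.

    Returns (weights, context). Weights are 0.0 at positions that belong
    to another language, so those tokens cannot influence the context.
    """
    idx = [i for i, tag in enumerate(key_langs) if tag == lang]
    dim = len(values[0])
    weights = [0.0] * len(keys)
    if not idx:  # no token of this language in the context
        return weights, [0.0] * dim
    scale = math.sqrt(len(query))
    probs = softmax([dot(query, keys[i]) / scale for i in idx])
    for i, p in zip(idx, probs):
        weights[i] = p
    context = [sum(weights[i] * values[i][d] for i in idx) for d in range(dim)]
    return weights, context


def multi_language_attention(query, keys, values, key_langs, langs=("zh", "en")):
    """Assumed combination step: sum the per-language context vectors."""
    combined = [0.0] * len(values[0])
    for lang in langs:
        _, ctx = language_attention(query, keys, values, key_langs, lang)
        combined = [c + x for c, x in zip(combined, ctx)]
    return combined


# Toy context: two tokens tagged "zh" and one tagged "en" (hypothetical data).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
key_langs = ["zh", "en", "zh"]
query = [1.0, 0.0]

w_zh, ctx_zh = language_attention(query, keys, values, key_langs, "zh")
combined = multi_language_attention(query, keys, values, key_langs)
```

Because each language's attention sees only tokens of that language, each branch resembles attention over a monolingual fragment, which is the kind of link to monolingual data that the EC constraint suggests. How the per-language outputs should be combined is a design choice; summation is used here only because it is the simplest option.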