Neural network language models (NNLMs) play an essential role in automatic speech recognition (ASR) systems, especially in adaptation tasks where only text data is available. In practice, an NNLM is typically trained on a combination of data sampled from multiple corpora, so the data sampling strategy has a direct impact on adaptation performance. Most existing work focuses on designing static sampling strategies; however, each corpus may contribute differently at different stages of NNLM training. In this paper, we introduce a novel adaptive multi-corpora training algorithm that dynamically learns and adjusts the sampling probability of each corpus throughout training. The algorithm is robust to corpus sizes and domain relevance. Compared with static sampling baselines, the proposed approach yields remarkable improvements, achieving up to 7% and 9% relative word error rate (WER) reductions on in-domain and out-of-domain adaptation tasks, respectively.
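The abstract's core idea of dynamically re-weighting corpus sampling probabilities during training can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual algorithm: it assumes a hypothetical per-corpus score (e.g. a held-out loss improvement signal) and converts the scores into sampling probabilities with a softmax, re-drawing a corpus for each minibatch.

```python
import math
import random

def update_sampling_probs(scores, temperature=1.0):
    """Hypothetical sketch: softmax over per-corpus scores.

    A corpus with a higher score (e.g. larger recent held-out loss
    improvement) receives a higher sampling probability.
    """
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_corpus(probs, rng=random):
    """Draw one corpus index according to the current probabilities."""
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1  # guard against floating-point rounding

# Example: scores for three corpora are re-computed each epoch,
# and minibatches are then drawn from the re-weighted distribution.
scores = [0.5, 2.0, 1.0]            # assumed per-corpus signals
probs = update_sampling_probs(scores)
corpus_idx = sample_corpus(probs)
```

The temperature parameter (an assumption of this sketch) controls how aggressively the sampler concentrates on the currently most useful corpus; a static sampling baseline corresponds to fixing `probs` once before training.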