Automatic Speech Recognition (ASR) systems are used in numerous industrial applications across very diverse domains, creating a need to adapt them to new domains with small memory and deployment overhead. In this work, we introduce domain-prompts, a methodology that trains a small number of domain embedding parameters to prime a Transformer-based Language Model (LM) to a particular domain. Using this domain-adapted LM for rescoring ASR hypotheses achieves a 7-13% WER reduction on a new domain with just 1000 unlabeled, domain-specific textual sentences. This improvement is comparable to, or even better than, that of fully fine-tuned models, even though just 0.02% of the parameters of the base LM are updated. Additionally, our method is deployment-friendly, as the learnt domain embeddings are prefixed to the input of the model rather than changing the base model architecture. Our method is therefore an ideal choice for on-the-fly adaptation of LMs used in ASR systems, progressively scaling them to new domains.
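To make the prefixing idea concrete, the following is a minimal PyTorch sketch of the general soft-prompt technique described above, assuming a Hugging Face GPT-2 backbone; the model name, prompt length, and class name are illustrative assumptions, not the paper's actual implementation or hyperparameters.

    # Minimal sketch: learnable domain embeddings prefixed to a frozen LM's input.
    import torch
    import torch.nn as nn
    from transformers import GPT2LMHeadModel

    class DomainPromptLM(nn.Module):
        def __init__(self, base_name="gpt2", prompt_len=10):
            super().__init__()
            self.lm = GPT2LMHeadModel.from_pretrained(base_name)
            # Freeze every base-LM parameter; only the prompt embeddings train.
            for p in self.lm.parameters():
                p.requires_grad = False
            emb_dim = self.lm.config.n_embd
            # The learnable domain embeddings that get prefixed to the input.
            self.prompt = nn.Parameter(torch.randn(prompt_len, emb_dim) * 0.02)

        def forward(self, input_ids, labels=None):
            tok_emb = self.lm.transformer.wte(input_ids)            # (B, T, D)
            prefix = self.prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
            inputs_embeds = torch.cat([prefix, tok_emb], dim=1)     # (B, P+T, D)
            if labels is not None:
                # Mask out the prompt positions so they incur no LM loss.
                pad = torch.full(
                    (input_ids.size(0), self.prompt.size(0)), -100,
                    dtype=labels.dtype, device=labels.device)
                labels = torch.cat([pad, labels], dim=1)
            return self.lm(inputs_embeds=inputs_embeds, labels=labels)

Under this sketch, the prompt embeddings would be trained with the standard LM cross-entropy loss on the small set of in-domain sentences, and the adapted model would then score ASR n-best hypotheses for rescoring; deploying a new domain only requires shipping the small prompt tensor, since the base LM stays frozen and unmodified.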