Speech model adaptation is crucial for handling the discrepancy between server-side proxy training data and the actual data received on users' local devices. Using federated learning (FL), we introduce an efficient approach for continuously adapting neural network language models (NNLMs) on private devices, with applications to automatic speech recognition (ASR). To address potential speech transcription errors in the on-device training corpus, we perform empirical studies comparing various strategies for leveraging token-level confidence scores to improve NNLM quality in the FL setting. Experiments show that, compared with no model adaptation, the proposed method achieves relative word error rate (WER) reductions of 2.6% and 10.8% on two speech evaluation datasets, respectively. We also provide an analysis evaluating the privacy guarantees of the presented procedure.
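The abstract mentions strategies for leveraging token-level confidence scores but does not spell out a specific recipe. As a minimal illustrative sketch (not the paper's exact method), one natural strategy is to scale each token's cross-entropy loss by its ASR confidence during the local on-device update, so that likely mis-transcribed tokens contribute less to the gradient. All names and hyperparameters below (TinyNNLM, layer sizes, learning rate) are hypothetical.

```python
# Sketch: confidence-weighted NNLM training loss for on-device (FL client) updates.
# Assumes per-token ASR confidence scores in [0, 1] aligned with the transcript.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNNLM(nn.Module):
    """Toy LSTM language model standing in for the on-device NNLM."""
    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):               # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embed(tokens))
        return self.proj(h)                  # logits: (batch, seq_len, vocab)

def confidence_weighted_loss(logits, targets, confidences):
    """Per-token cross entropy, scaled by token-level confidence scores.

    Low-confidence (likely mis-transcribed) tokens contribute less to the
    local gradient update computed on the device.
    """
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="none",
    )
    weights = confidences.reshape(-1)
    return (weights * per_token).sum() / weights.sum().clamp(min=1e-8)

# One local training step on synthetic data.
vocab, batch, seq = 1000, 4, 12
model = TinyNNLM(vocab)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
tokens = torch.randint(0, vocab, (batch, seq + 1))   # hypothesized transcript ids
conf = torch.rand(batch, seq)                        # ASR token confidences in [0, 1]
loss = confidence_weighted_loss(model(tokens[:, :-1]), tokens[:, 1:], conf)
loss.backward()
opt.step()
```

In an FL round, each device would run steps like the one above on its private transcripts and send only the resulting model delta to the server for aggregation; the confidence weighting affects only the local loss computation.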