Automatic speech recognition (ASR) with federated learning (FL) makes it possible to leverage data from multiple clients without compromising privacy. The quality of FL-based ASR can be measured by recognition performance and by communication and computation costs. When the data of different clients are not independently and identically distributed (non-IID), performance can degrade significantly. In this work, we tackle the non-IID issue in FL-based ASR with personalized FL, which learns a personalized model for each client. Concretely, we propose two types of personalized FL approaches for ASR. First, we adapt personalization-layer-based FL to ASR, which keeps some layers local to each client to learn personalized models. Second, to reduce communication and computation costs, we propose decoupled federated learning (DecoupleFL). On the one hand, DecoupleFL moves the computation burden to the server, thus decreasing the computation on clients. On the other hand, DecoupleFL communicates secure high-level features instead of model parameters, thus reducing communication cost when models are large. Experiments demonstrate that the two proposed personalized FL-based ASR approaches reduce WER by 2.3%-3.4% compared with FedAvg. Among them, DecoupleFL incurs only 11.4% of the communication cost and 75% of the computation cost of FedAvg, which is also significantly less than personalization-layer-based FL.
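To make the first approach concrete, the following is a minimal sketch of one communication round of personalization-layer-based FL: each client keeps its personalization layers local and only the remaining shared parameters are uploaded and averaged FedAvg-style on the server. The parameter names (e.g., `enc.w`, `out.w`), the choice of which layers are personal, and the dataset sizes are illustrative assumptions, not details taken from the paper.

```python
# Sketch of personalization-layer-based FL (illustrative only).
import numpy as np

def split_params(params, personal_keys):
    """Partition a parameter dict into shared and personal parts."""
    shared = {k: v for k, v in params.items() if k not in personal_keys}
    personal = {k: v for k, v in params.items() if k in personal_keys}
    return shared, personal

def fedavg(shared_updates, weights):
    """Weighted average of the shared parameters uploaded by all clients."""
    total = sum(weights)
    keys = shared_updates[0].keys()
    return {k: sum(w * u[k] for w, u in zip(weights, shared_updates)) / total
            for k in keys}

# Hypothetical two-layer model per client: the encoder is shared,
# the output layer is kept as the personalization layer.
personal_keys = {"out.w", "out.b"}
clients = [
    {"enc.w": np.ones((4, 4)), "out.w": np.full((4, 2), 0.1), "out.b": np.zeros(2)},
    {"enc.w": np.zeros((4, 4)), "out.w": np.full((4, 2), 0.9), "out.b": np.ones(2)},
]
sizes = [100, 300]  # local dataset sizes used as FedAvg weights

# One communication round: clients upload only the shared parameters,
# the server averages them, and the personal layers never leave the client.
shared_parts = [split_params(c, personal_keys)[0] for c in clients]
global_shared = fedavg(shared_parts, sizes)
for c in clients:
    c.update(global_shared)  # personalization layers stay untouched
```

Because only the shared parameters are communicated and averaged, the local layers are free to specialize to each client's (non-IID) data, which is the mechanism the abstract refers to.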