Federated learning can be used to train machine learning models at the edge, on local data that never leaves devices, providing privacy by default. However, this presents a challenge pertaining to the communication and computation costs incurred on clients' devices. These costs are strongly correlated with the size of the model being trained, and are significant for state-of-the-art automatic speech recognition models. We propose using federated dropout to reduce the size of client models while training a full-size model server-side. We provide empirical evidence of the effectiveness of federated dropout, and propose a novel approach to vary the dropout rate applied at each layer. Furthermore, we find that federated dropout enables a set of smaller sub-models within the larger model to independently achieve low word error rates, making it easier to dynamically adjust the size of the model deployed for inference.
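To make the mechanism concrete, the following is a minimal sketch of per-layer federated dropout on dense layers represented as NumPy weight matrices: the server samples a subset of units for each layer (with a per-layer keep fraction, i.e. one minus the dropout rate), ships the smaller sub-model to a client, and scatters the client's update back into the full model. The function names (`extract_submodel`, `merge_update`) and the toy layer sizes are illustrative assumptions, not the paper's implementation, which targets full ASR architectures.

```python
# Hypothetical sketch of per-layer federated dropout on dense layers.
import numpy as np

rng = np.random.default_rng(0)

def extract_submodel(full_weights, keep_fractions):
    """Sample a sub-model by keeping a random subset of units per layer.

    full_weights: list of (in_dim, out_dim) matrices (the server-side model).
    keep_fractions: per-layer fraction of output units sent to the client
        (1 - dropout rate); varying this per layer is the per-layer idea
        mentioned in the abstract.
    Returns the smaller client weights plus the kept indices needed to
    merge the client's update back into the full model.
    """
    client_weights, kept_indices = [], []
    prev_idx = np.arange(full_weights[0].shape[0])  # keep all input features
    for W, keep in zip(full_weights, keep_fractions):
        n_keep = max(1, int(round(keep * W.shape[1])))
        idx = np.sort(rng.choice(W.shape[1], size=n_keep, replace=False))
        # Drop the same units consistently: columns of this layer and the
        # matching rows of the next layer (tracked via prev_idx).
        client_weights.append(W[np.ix_(prev_idx, idx)].copy())
        kept_indices.append((prev_idx, idx))
        prev_idx = idx
    return client_weights, kept_indices

def merge_update(full_weights, client_weights, kept_indices):
    """Scatter the client's trained sub-matrices back into the full model."""
    for W, W_client, (rows, cols) in zip(full_weights, client_weights, kept_indices):
        W[np.ix_(rows, cols)] = W_client

# Toy usage: two layers, a lower dropout rate on the first layer, and the
# output layer kept whole (keep fraction 1.0) so the label space is unchanged.
server_model = [rng.standard_normal((80, 256)), rng.standard_normal((256, 128))]
sub_model, indices = extract_submodel(server_model, keep_fractions=[0.75, 1.0])
# ... the client would train sub_model locally on its own data here ...
merge_update(server_model, sub_model, indices)
```

The sketch also hints at why smaller sub-models can later be deployed on their own for inference: each client only ever sees, and trains, a consistent slice of the full model.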