Federated learning (FL) is a privacy-promoting framework that enables a potentially large number of clients to collaboratively train machine learning models. In an FL system, a server coordinates the collaboration by collecting and aggregating the clients' model updates while the clients' data remains local and private. A major challenge in federated learning arises when the local data is heterogeneous: in this setting, the performance of the learned global model may deteriorate significantly compared to the scenario where data is identically distributed across the clients. In this paper we propose FedDPMS (Federated Differentially Private Means Sharing), an FL algorithm in which clients deploy variational auto-encoders to augment their local datasets with data synthesized from differentially private means of latent data representations communicated by a trusted server. Such augmentation ameliorates the effects of data heterogeneity across the clients without compromising privacy. Our experiments on deep image classification tasks demonstrate that FedDPMS outperforms competing state-of-the-art FL methods specifically designed for heterogeneous data settings.
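To make the means-sharing idea concrete, the following is a minimal sketch of the mechanism the abstract describes: a client encodes local data with a VAE, releases only a noisy (Gaussian-mechanism) mean of the latent representations, and a peer decodes samples drawn around that mean into synthetic augmentation data. The linear encoder/decoder stand-ins and all names (`encode`, `decode`, `dp_mean`, `clip_norm`, `sigma`) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of differentially private latent-mean sharing, assuming a
# pre-trained VAE whose encoder/decoder are stood in for by linear maps.
# Hyperparameters and helper names are hypothetical, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8

# Stand-ins for a trained VAE; a real system would use learned networks.
W_enc = rng.normal(size=(LATENT_DIM, 32))
W_dec = rng.normal(size=(32, LATENT_DIM))

def encode(x):
    return x @ W_enc.T          # data -> latent representation

def decode(z):
    return z @ W_dec.T          # latent -> synthetic data sample

def dp_mean(latents, clip_norm=1.0, sigma=0.5):
    """Differentially private mean via the Gaussian mechanism:
    clip each latent vector's L2 norm, average, add calibrated noise."""
    norms = np.linalg.norm(latents, axis=1, keepdims=True)
    clipped = latents * np.minimum(1.0, clip_norm / norms)
    mean = clipped.mean(axis=0)
    # L2 sensitivity of the clipped mean is clip_norm / n.
    noise = rng.normal(scale=sigma * clip_norm / len(latents),
                       size=mean.shape)
    return mean + noise

# A client encodes its local samples of one class and publishes only the
# noisy latent mean (relayed by the server), never the raw data.
local_data = rng.normal(size=(100, 32))
noisy_mean = dp_mean(encode(local_data))

# Another client synthesizes augmentation samples by decoding latents
# drawn around the received noisy mean.
synthetic = decode(noisy_mean + 0.1 * rng.normal(size=(20, LATENT_DIM)))
print(synthetic.shape)  # (20, 32): synthetic samples for local augmentation
```

Under this sketch, the privacy guarantee comes from the clipping-plus-noise step alone: the server and other clients ever see only the noisy mean, so the synthetic samples inherit its differential-privacy protection by post-processing.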