Federated learning (FL) has attracted much attention as a privacy-preserving distributed machine learning framework, where many clients collaboratively train a machine learning model by exchanging model updates with a parameter server instead of sharing their raw data. Nevertheless, FL training suffers from slow convergence and unstable performance due to stragglers caused by the heterogeneous computational resources of clients and fluctuating communication rates. This paper proposes a coded FL framework, namely stochastic coded federated learning (SCFL), to mitigate the straggler issue. In the proposed framework, each client generates a privacy-preserving coded dataset by adding noise to a random linear combination of its local data. The server collects the coded datasets from all the clients to construct a composite dataset, which helps to compensate for the straggling effect. In the training process, both the server and the clients perform mini-batch stochastic gradient descent (SGD), and the server adds a make-up term during model aggregation to obtain unbiased gradient estimates. We characterize the privacy guarantee via mutual information differential privacy (MI-DP) and analyze the convergence performance of the proposed framework. Moreover, we demonstrate a privacy-performance tradeoff of the proposed SCFL method by analyzing the influence of the privacy constraint on the convergence rate. Finally, numerical experiments corroborate our analysis and show the benefits of SCFL in achieving fast convergence while preserving data privacy.
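To make the client-side coding step concrete, the following is a minimal sketch of how a client might generate a privacy-preserving coded dataset as described above. The function name, the Gaussian encoding matrix, and the noise scaling are illustrative assumptions for exposition, not the paper's exact construction.

```python
import numpy as np

def generate_coded_dataset(X, y, num_coded, noise_std, rng=None):
    """Illustrative sketch (assumed construction): form coded samples as
    random linear combinations of the local data (X, y) and perturb them
    with additive Gaussian noise before uploading them to the server."""
    rng = np.random.default_rng() if rng is None else rng
    n_local = X.shape[0]
    # Random encoding matrix: each coded sample mixes all local samples.
    G = rng.normal(0.0, 1.0 / np.sqrt(n_local), size=(num_coded, n_local))
    # Additive noise provides the privacy protection analyzed via MI-DP.
    X_coded = G @ X + noise_std * rng.standard_normal((num_coded, X.shape[1]))
    y_coded = G @ y + noise_std * rng.standard_normal(num_coded)
    return X_coded, y_coded
```

The server would aggregate such coded datasets from all clients into a composite dataset and run mini-batch SGD on it alongside the clients, compensating for stragglers; the exact encoding distribution, noise calibration, and make-up term are specified in the paper.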