Federated learning (FL) is a promising solution for enabling many AI applications in which sensitive datasets from distributed clients are needed to collaboratively train a global model. FL allows the clients to participate in the training phase, governed by a central server, without sharing their local data. One of the main challenges of FL is the communication overhead incurred when the model updates of the participating clients are sent to the central server at each global training round. Over-the-air computation (AirComp) has recently been proposed to alleviate this communication bottleneck by having the clients transmit their model updates simultaneously over the multiple-access channel. However, simple averaging of the model updates via AirComp makes the learning process vulnerable to random or intentional modifications of the local model updates by Byzantine clients. In this paper, we propose a transmission and aggregation framework that reduces the effect of such attacks while preserving the benefits of AirComp for FL. In the proposed robust approach, the central server divides the participating clients randomly into groups and allocates a transmission time slot to each group. The updates of the different groups are then combined using a robust aggregation technique. We extend our approach to handle non-i.i.d. local data by adding a resampling step before robust aggregation. We analyze the convergence of the proposed approach for both i.i.d. and non-i.i.d. data and show that the proposed algorithm converges at a linear rate to a neighborhood of the optimal solution. Experiments on real datasets confirm the robustness of the proposed approach.
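To make the grouped transmission and robust aggregation steps concrete, the following is a minimal, self-contained sketch of one global round at the server. The abstract does not name the specific robust aggregation rule, so the sketch assumes the geometric median (computed via Weiszfeld's algorithm) as one common choice; the `resample` helper for the non-i.i.d. extension, the noise model, and all function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def geometric_median(points, n_iters=100, eps=1e-8):
    """Weiszfeld's algorithm for the geometric median of row vectors.
    (Assumed robust aggregator; the paper's rule may differ.)"""
    z = points.mean(axis=0)
    for _ in range(n_iters):
        dists = np.maximum(np.linalg.norm(points - z, axis=1), eps)
        w = 1.0 / dists
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:
            break
        z = z_new
    return z

def resample(vectors, s, rng):
    """Sketch of a resampling step for non-i.i.d. data: replace each
    vector by the mean of s randomly drawn vectors, reducing
    heterogeneity before robust aggregation (hypothetical helper)."""
    v = np.stack(vectors)
    idx = rng.integers(0, len(v), size=(len(v), s))
    return [v[row].mean(axis=0) for row in idx]

def grouped_aircomp_aggregate(updates, n_groups, noise_std=0.0,
                              resample_s=None, rng=None):
    """Randomly partition client updates into groups, simulate AirComp
    (a noisy sum over the multiple-access channel per time slot), then
    combine the group averages with a robust aggregator."""
    rng = rng or np.random.default_rng()
    perm = rng.permutation(len(updates))
    group_avgs = []
    for g in np.array_split(perm, n_groups):
        # AirComp: all clients in the group transmit in the same slot;
        # the server observes the noisy superposition of their signals.
        received = sum(updates[i] for i in g)
        received = received + noise_std * rng.standard_normal(received.shape)
        group_avgs.append(received / len(g))
    if resample_s is not None:           # non-i.i.d. extension
        group_avgs = resample(group_avgs, resample_s, rng)
    return geometric_median(np.stack(group_avgs))

# Hypothetical demo: 18 honest clients plus 2 Byzantine clients that
# transmit large arbitrary vectors.
rng = np.random.default_rng(0)
honest = [1.0 + 0.1 * rng.standard_normal(10) for _ in range(18)]
byzantine = [100.0 * rng.standard_normal(10) for _ in range(2)]
agg = grouped_aircomp_aggregate(honest + byzantine, n_groups=5,
                                noise_std=0.01, rng=rng)
print(agg)  # stays near the honest mean despite the Byzantine updates
```

With at most two corrupted clients, at most two of the five group averages can be arbitrarily wrong, which keeps the corrupted fraction below one half, the regime in which the geometric median remains a bounded-error estimate of the honest group mean.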