Federated Averaging (FedAVG) has become the most popular federated learning algorithm due to its simplicity and low communication overhead. We use simple examples to show that FedAVG tends to sew together the optima of the participating clients. These sewed optima exhibit poor generalization when the model is used on a new client with a new data distribution. Inspired by the invariance principles of (Arjovsky et al., 2019; Parascandolo et al., 2020), we focus on learning a model that is simultaneously locally optimal across the different clients. We propose a modification to the FedAVG algorithm that computes masked gradients (the AND-mask from Parascandolo et al. (2020)) across the clients and uses them to carry out an additional server model update. We show that this algorithm achieves better out-of-distribution accuracy than FedAVG, especially when the data is non-identically distributed across clients.
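The following is a minimal sketch (not the authors' implementation) of the server-side step described above: after the usual FedAVG model average, the server applies an additional update using AND-masked client pseudo-gradients, where a coordinate is kept only if its gradient sign agrees across the clients. The function and parameter names (`fedavg_with_and_mask`, `server_lr`, `agreement_threshold`) are illustrative assumptions.

```python
import numpy as np

def fedavg_with_and_mask(server_w, client_ws, server_lr=1.0, agreement_threshold=1.0):
    """Sketch of FedAVG with an extra AND-masked server update.

    server_w:   1-D array with the current server parameters.
    client_ws:  list of 1-D arrays, one per client, after local training.
    """
    client_ws = np.stack(client_ws)              # shape (num_clients, dim)

    # Standard FedAVG step: average the client models.
    avg_w = client_ws.mean(axis=0)

    # Client pseudo-gradients relative to the current server model.
    pseudo_grads = server_w - client_ws          # shape (num_clients, dim)

    # AND-mask (Parascandolo et al., 2020): keep only coordinates whose
    # gradient sign agrees across (a fraction >= agreement_threshold of) clients.
    signs = np.sign(pseudo_grads)
    agreement = np.abs(signs.mean(axis=0))       # 1.0 means all clients agree
    mask = (agreement >= agreement_threshold).astype(pseudo_grads.dtype)

    # Additional server update with the masked, averaged pseudo-gradient.
    masked_grad = mask * pseudo_grads.mean(axis=0)
    return avg_w - server_lr * masked_grad
```

With `agreement_threshold=1.0` this reduces to a strict AND-mask (all clients must agree on the sign); lower thresholds relax the consensus requirement, which is one plausible design choice when the number of clients per round is large.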