In Federated Learning, a number of clients collaborate to train a model without sharing their data. Client models are optimized locally and communicated through a central hub called the server. A major challenge is dealing with heterogeneity among the clients' data, which causes the local optimization to drift away from the global objective. To estimate and thereby remove this drift, variance reduction techniques have recently been incorporated into Federated Learning optimization. However, the existing solutions propagate their estimation errors throughout the optimization trajectory, which leads to inaccurate approximations of the clients' drift and ultimately a failure to remove it properly. In this paper, we address this issue by introducing an adaptive algorithm that efficiently reduces clients' drift. Compared to previous works on adapting variance reduction to Federated Learning, our approach uses the same or less communication bandwidth, computation, and memory. Additionally, it addresses the instability problem prevalent in prior work, caused by the increasing norm of the estimates, which makes our approach a much more practical solution for large-scale Federated Learning settings. Our experimental results demonstrate that the proposed algorithm converges significantly faster and achieves higher accuracy than the baselines on an extensive set of Federated Learning benchmarks.
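To make the idea of drift correction concrete, the sketch below shows a generic SCAFFOLD-style local update in which a client subtracts an estimated drift term (the difference between its local control variate and the server's global one) from each stochastic gradient step. This is only an illustration of how variance-reduction estimates enter the local optimization, not the algorithm proposed in this paper; the function and variable names (`local_update`, `grad_fn`, `c_local`, `c_global`) are hypothetical.

```python
import numpy as np

def local_update(w_global, grad_fn, c_global, c_local, lr=0.1, steps=10):
    """Drift-corrected local training on one client (generic SCAFFOLD-style sketch).

    w_global : global model parameters received from the server
    grad_fn  : callable returning a stochastic gradient at the given parameters
    c_global : server-side estimate of the average client update direction
    c_local  : this client's control variate from its previous round
    """
    w = w_global.copy()
    for _ in range(steps):
        g = grad_fn(w)
        # Subtract the estimated drift (c_local - c_global) so the local step
        # follows the global objective rather than the purely local one.
        w -= lr * (g - c_local + c_global)
    # Refresh the client's control variate from the realized local progress.
    c_local_new = c_local - c_global + (w_global - w) / (lr * steps)
    return w, c_local_new
```

The server would then aggregate the returned models and control variates across participating clients; how the drift estimates are maintained and kept stable is precisely where methods differ, and where the propagated estimation error discussed above arises.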