Federated learning (FL) learns a model jointly from a set of participating devices without sharing their privately held data. Non-i.i.d. data across the network, low device participation, high communication costs, and the mandate that data remain private all pose challenges to understanding the convergence of FL algorithms, particularly with regard to how convergence scales with the number of participating devices. In this paper, we focus on Federated Averaging (FedAvg)--arguably the most popular and effective FL algorithm class in use today--and provide a unified and comprehensive study of its convergence rate. Although an emerging line of work has recently studied FedAvg, a systematic study of how its convergence scales with the number of participating devices in the fully heterogeneous FL setting is still lacking--a crucial issue whose answer would shed light on the performance of FedAvg in large FL systems in practice. We fill this gap with a unified analysis that establishes convergence guarantees for FedAvg on strongly convex smooth, convex smooth, and overparameterized strongly convex smooth problems. We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies. While linear speedup results exist in distributed optimization under full participation, ours are the first to establish linear speedup for FedAvg under both statistical and system heterogeneity. For strongly convex and convex problems, we also characterize the corresponding convergence rates of the Nesterov accelerated FedAvg algorithm; these are the first linear speedup guarantees for momentum variants of FedAvg in convex settings. Empirical studies of the algorithms in various settings support our theoretical results.
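For reference, FedAvg alternates local SGD steps on each participating device with periodic server-side averaging; the following is a minimal standard formulation (the notation here is ours, not taken from this paper). With local objectives $F_k$, a sampled device set $\mathcal{S}_t$, step size $\eta_t$, and averaging every $E$ local steps,
$$
w_{t+1}^k = w_t^k - \eta_t \nabla F_k(w_t^k, \xi_t^k),
\qquad
w_{t+1} = \frac{1}{|\mathcal{S}_t|} \sum_{k \in \mathcal{S}_t} w_{t+1}^k
\quad \text{whenever } E \mid (t+1),
$$
where $\xi_t^k$ denotes a minibatch drawn from device $k$'s local data. Linear speedup then means the dominant term of the convergence bound shrinks proportionally to the number of participating devices.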