Present-day federated learning (FL) systems deployed over edge networks must routinely handle a large number of workers with high degrees of heterogeneity in data and/or computing capabilities. This diverse set of workers necessitates the development of FL algorithms that allow: (1) flexible worker participation, so that workers can engage in training at will, (2) a varying number of local updates at each worker (based on its computational resources) along with asynchronous communication with the server, and (3) heterogeneous data across workers. To address these challenges, in this work, we propose a new paradigm in FL called ``Anarchic Federated Learning'' (AFL). In stark contrast to conventional FL models, each worker in AFL has complete freedom to choose i) when to participate in FL, and ii) the number of local steps to perform in each round based on its current situation (e.g., battery level, communication channels, privacy concerns). However, AFL also introduces significant challenges in algorithmic design because the server needs to handle the chaotic worker behaviors. Toward this end, we propose two Anarchic FedAvg-like algorithms with two-sided learning rates for the cross-device and cross-silo settings, named AFedAvg-TSLR-CD and AFedAvg-TSLR-CS, respectively. For general worker information arrival processes, we show that both algorithms retain the highly desirable linear speedup effect in the new AFL paradigm. Moreover, we show that our AFedAvg-TSLR algorithmic framework can be viewed as a {\em meta-algorithm} for AFL in the sense that it can utilize advanced FL algorithms as worker- and/or server-side optimizers to achieve enhanced performance under AFL. We validate the proposed algorithms with extensive experiments on real-world datasets.
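To make the AFL setting concrete, the following toy sketch simulates workers that decide for themselves whether to join a round and how many local steps to run, with separate worker-side and server-side learning rates. It is an illustration under stated assumptions, not the AFedAvg-TSLR-CD or AFedAvg-TSLR-CS algorithm itself: the scalar quadratic losses, the names `local_steps`, `anarchic_fedavg`, `join_prob`, and `max_steps`, the random participation model, and the choice to divide each update by the worker's step count are all hypothetical, and asynchronous (stale) updates are omitted for brevity.

```python
import random


def local_steps(x, target, num_steps, local_lr):
    """Run `num_steps` gradient steps on the toy loss 0.5 * (x - target)**2."""
    for _ in range(num_steps):
        grad = x - target          # exact gradient of the toy quadratic loss
        x -= local_lr * grad       # worker-side learning rate
    return x


def anarchic_fedavg(targets, rounds=400, local_lr=0.05, server_lr=1.0,
                    join_prob=0.5, max_steps=5, x0=5.0, seed=0):
    """Toy FedAvg-style loop with self-selected participation and local steps.

    Each entry of `targets` is one worker's local optimum (heterogeneous data).
    All modelling choices here are assumptions made for illustration only.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(rounds):
        updates = []
        for target in targets:
            # Assumption: each worker joins independently with prob. join_prob.
            if rng.random() >= join_prob:
                continue
            # Assumption: the worker picks its own number of local steps.
            k = rng.randint(1, max_steps)
            x_local = local_steps(x, target, k, local_lr)
            # Assumption: divide by k so longer runs do not dominate the average.
            updates.append((x_local - x) / k)
        if updates:  # the server only moves when someone reported back
            x += server_lr * sum(updates) / len(updates)  # server-side rate
    return x


if __name__ == "__main__":
    targets = [-1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
    print("final model:", anarchic_fedavg(targets))
    print("mean of worker optima:", sum(targets) / len(targets))
```

In this toy problem the global optimum is the mean of the workers' local optima, so the final model can be compared against that value directly. Changing `join_prob` or `max_steps` gives a quick feel for how erratic participation affects the trajectory.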