Federated Learning (FL) exploits the computational power of edge devices, typically mobile phones, while addressing privacy by letting data stay where it is produced. FL has been used by major service providers to improve item recommendations, virtual keyboards and text auto-completion services. While appealing, FL performance is hampered by multiple factors: i) differing capabilities of participating clients (e.g., computing power, memory and network connectivity); ii) strict training constraints where devices must be idle, plugged in and connected to an unmetered WiFi network; and iii) data heterogeneity (a.k.a. non-IIDness). Together, these lead to uneven participation, straggling, dropout and consequently slow convergence, challenging the practicality of FL for many applications. In this paper, we present GeL, the Guess and Learn algorithm, which significantly speeds up convergence by guessing model updates for each client. The power of GeL lies in effectively performing "free" learning steps without any additional gradient computations. GeL produces these guesses through clever use of the moments in the Adam optimizer in combination with the last gradient computed on each client. Our extensive experimental study involving five standard FL benchmarks shows that GeL speeds up convergence by up to 1.64x in heterogeneous systems in the presence of data non-IIDness, saving tens of thousands of gradient computations.
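The abstract does not spell out GeL's exact update rule, but the core idea it names — reusing a client's last computed gradient together with Adam's moment state to take an extra step at no gradient cost — can be sketched as follows. This is a minimal illustration under that assumption, not the paper's actual algorithm; the function names and toy values are hypothetical:

```python
import numpy as np

def adam_step(theta, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update of parameters theta using gradient g.

    m, v are the first and second moment estimates; t is the step count
    (1-indexed) used for bias correction.
    """
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# One real step with a freshly computed gradient, followed by one
# "guessed" step that simply reuses the same gradient -- advancing the
# moment state without any new backward pass (the "free" step idea).
theta = np.zeros(3)
m = np.zeros(3)
v = np.zeros(3)
g_last = np.array([0.1, -0.2, 0.3])   # toy stand-in for the last gradient

theta, m, v = adam_step(theta, g_last, m, v, t=1)  # real step
theta, m, v = adam_step(theta, g_last, m, v, t=2)  # guessed step, no new gradient
```

In this toy sketch, the guessed step moves each parameter further in the direction suggested by the stale gradient, moderated by Adam's moment estimates; the actual quality of such guesses is what the paper's experiments evaluate.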