Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients, which gives rise to the client drift phenomenon. In fact, obtaining an algorithm for FL which is uniformly better than simple centralized training has been a major open problem thus far. In this work, we propose a general algorithmic framework, Mime, which i) mitigates client drift and ii) adapts arbitrary centralized optimization algorithms such as momentum and Adam to the cross-device federated learning setting. Mime uses a combination of control variates and server-level statistics (e.g. momentum) at every client-update step to ensure that each local update mimics that of the centralized method run on iid data. We prove a reduction result showing that Mime can translate the convergence of a generic algorithm in the centralized setting into convergence in the federated setting. Further, we show that when combined with momentum-based variance reduction, Mime is provably faster than any centralized method, the first such result. We also perform a thorough experimental exploration of Mime's performance on real-world datasets.
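To make the client-update step described above concrete, the following is a minimal sketch of a Mime-style local update, assuming SGD with momentum as the base centralized optimizer. The function and argument names (`mime_client_update`, `grad_fn`, `sample_batch`, `c_server`) are illustrative placeholders, not the paper's actual code; the key ideas shown are that the SVRG-style control variate corrects each minibatch gradient and that the server momentum is applied but never updated locally.

```python
import numpy as np


def mime_client_update(x_server, m_server, c_server, grad_fn, sample_batch,
                       num_local_steps=5, lr=0.1, beta=0.9):
    """Sketch of one client's local steps in a Mime-style round.

    x_server : server parameters at the start of the round (np.ndarray)
    m_server : server momentum statistic, kept *frozen* during local steps
    c_server : control variate, e.g. the client's full-batch gradient at x_server
    grad_fn(params, batch) : minibatch gradient of the client's loss
    sample_batch() : draws a minibatch from the client's local data
    """
    y = x_server.copy()
    for _ in range(num_local_steps):
        batch = sample_batch()
        # SVRG-style correction: evaluate the same minibatch at y and at
        # x_server, then add the control variate. This mitigates client drift.
        g = grad_fn(y, batch) - grad_fn(x_server, batch) + c_server
        # Apply the centralized update rule (here SGD with momentum) using the
        # server's momentum, which is applied but NOT updated on the client.
        y = y - lr * ((1.0 - beta) * g + beta * m_server)
    return y  # the server averages returned iterates and refreshes its statistics
```

In this sketch, the server would aggregate the returned iterates and clients' full-batch gradients at `x_server` to update the global model and the momentum statistic between rounds.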