We propose a novel framework to study asynchronous federated learning optimization with delays in gradient updates. Our theoretical framework extends the standard FedAvg aggregation scheme by introducing stochastic aggregation weights to represent the variability of clients' update times, due, for example, to heterogeneous hardware capabilities. Our formalism applies to the general federated setting where clients have heterogeneous datasets and perform at least one step of stochastic gradient descent (SGD). We demonstrate convergence for such a scheme and provide sufficient conditions for the related minimum to be the optimum of the federated problem. We show that our general framework applies to existing optimization schemes, including centralized learning, FedAvg, asynchronous FedAvg, and FedBuff. The theory provided here yields meaningful guidelines for designing federated learning experiments in heterogeneous conditions. In particular, we develop FedFix, a novel extension of FedAvg enabling efficient asynchronous federated training while preserving the convergence stability of synchronous aggregation. We empirically validate our theory through a series of experiments showing that asynchronous FedAvg leads to fast convergence at the expense of stability, and we finally demonstrate the improvements of FedFix over synchronous and asynchronous FedAvg.
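To make the role of stochastic aggregation weights concrete, the following is a minimal Python sketch of a FedAvg-style server aggregation in which each client's contribution arrives with some probability, modeling asynchronous delays. The helper name `aggregate`, the Bernoulli-arrival model, and the `p_i / q_i` rescaling are illustrative assumptions for exposition, not the exact scheme analyzed in the paper.

```python
# Minimal sketch (assumptions noted above): server-side aggregation where each
# client update is weighted by a stochastic aggregation weight, modeling the
# variability of client response times (e.g., slow hardware, stale updates).
import numpy as np

rng = np.random.default_rng(0)

def aggregate(global_params, client_deltas, importance, participation_prob):
    """FedAvg-style aggregation with stochastic aggregation weights.

    global_params:      current global model parameters (np.ndarray).
    client_deltas:      list of client parameter updates, possibly computed
                        on stale versions of the global model.
    importance:         deterministic client importance p_i (e.g., data fraction).
    participation_prob: probability q_i that client i's update arrives this round.
    """
    new_params = global_params.copy()
    for delta, p_i, q_i in zip(client_deltas, importance, participation_prob):
        # Bernoulli arrival models asynchronous delays; rescaling by p_i / q_i
        # keeps the aggregated update unbiased in expectation (an illustrative
        # choice, hypothetical rather than taken from the paper).
        if rng.random() < q_i:
            new_params = new_params + (p_i / q_i) * delta
    return new_params
```

Under this kind of model, synchronous FedAvg corresponds to all arrival probabilities equal to one, while smaller probabilities capture clients whose updates are delayed across aggregation rounds.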