In an asynchronous federated learning framework, the server updates the global model as soon as it receives an update from a client, instead of waiting for all updates to arrive as in the synchronous setting. This allows heterogeneous devices with varied computing power to train local models without pausing, thereby speeding up the training process. However, it introduces the stale model problem: a newly arrived update is calculated from a set of stale weights that are older than the current global model, which may hurt convergence. In this paper, we present an asynchronous federated learning framework with a proposed adaptive weight aggregation algorithm, referred to as AsyncFedED. To the best of our knowledge, this is the first aggregation method that accounts for both the staleness of an arriving update, measured by the Euclidean distance between the stale model and the current global model, and the number of local epochs the client has performed. Assuming general non-convex loss functions, we prove the convergence of the proposed method theoretically. Numerical results on three tasks validate the effectiveness of the proposed AsyncFedED in terms of convergence rate and model accuracy compared to existing methods.
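To make the aggregation idea concrete, the following is a minimal sketch of a staleness-aware server update. It is not the paper's exact rule: the discounting form `eta0 / (1 + alpha * staleness)`, the per-epoch normalization, and the parameter names `eta0` and `alpha` are all illustrative assumptions.

```python
import numpy as np

def staleness(w_global, w_stale):
    """Staleness measured as the Euclidean distance between the stale
    snapshot a client started from and the current global model."""
    return np.linalg.norm(w_global - w_stale)

def adaptive_aggregate(w_global, w_stale, w_client, num_local_epochs,
                       eta0=1.0, alpha=1.0):
    """Hypothetical staleness-aware update: the server applies the client's
    update with a step size that shrinks as staleness grows, and normalizes
    the update by the number of local epochs the client performed."""
    # Per-epoch client progress, so clients running more local epochs
    # do not dominate the aggregation.
    delta = (w_client - w_stale) / max(num_local_epochs, 1)
    # Staler updates (farther from the current global model) get a
    # smaller effective learning rate.
    eta = eta0 / (1.0 + alpha * staleness(w_global, w_stale))
    return w_global + eta * delta

# Example: a client trained 5 local epochs starting from an old snapshot.
w_global = np.ones(10)
w_stale = 0.8 * np.ones(10)   # snapshot the client started from
w_client = 0.9 * np.ones(10)  # weights the client returns
new_global = adaptive_aggregate(w_global, w_stale, w_client, num_local_epochs=5)
```

The key design choice this illustrates is that staleness is measured in parameter space (a Euclidean distance) rather than by wall-clock lag or a version counter, so an update computed on old but still-nearby weights is penalized less than one computed on weights far from the current model.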