Federated learning (FL) has gained popularity as a means of training models distributed across the wireless edge. This paper introduces delay-aware federated learning (DFL) to improve the efficiency of distributed machine learning (ML) model training by accounting for communication delays between the edge and the cloud. In DFL, devices perform multiple stochastic gradient descent iterations on their local datasets during each global aggregation interval, and edge servers intermittently aggregate model parameters within their local subnetworks. At each global synchronization, the cloud server synchronizes the local models with the deployed global model, which is computed via a local-global combiner. The convergence behavior of DFL is analyzed theoretically under a generalized data heterogeneity metric, yielding a set of conditions under which a sub-linear convergence rate of O(1/k) is achieved. Based on these findings, an adaptive control algorithm is developed for DFL that adjusts its policies to reduce energy consumption and edge-to-cloud communication latency while still targeting the sub-linear convergence rate. Numerical evaluations show that, compared with existing FL algorithms, DFL achieves faster global model convergence, lower resource consumption, and greater robustness to communication delays. In summary, the proposed method improves training efficiency and delivers strong results for both convex and non-convex loss functions.
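To make the training structure concrete, below is a minimal NumPy sketch of a hierarchical loop in the spirit of DFL: devices run several SGD steps on toy local objectives, edge servers intermittently average models within their subnetworks, and the cloud combines the aggregated edge models with the previous global model through a convex local-global mix. All names, constants, the toy quadratic losses, and the specific combiner form are illustrative assumptions, not the paper's actual algorithm or its delay-aware control policy.

```python
import numpy as np

# Hypothetical dimensions and hyperparameters, for illustration only.
NUM_EDGES = 3            # edge subnetworks
DEVICES_PER_EDGE = 4     # devices per subnetwork
LOCAL_STEPS = 5          # SGD iterations per local round
EDGE_ROUNDS = 2          # edge aggregations per global interval
GLOBAL_ROUNDS = 10       # global synchronizations
LR = 0.1                 # SGD step size
ALPHA = 0.5              # local-global combiner weight (assumed convex mix)
DIM = 10                 # model dimension

rng = np.random.default_rng(0)

# Toy quadratic losses f(w) = 0.5 * ||w - c||^2, one target c per device,
# standing in for heterogeneous device datasets.
targets = rng.normal(size=(NUM_EDGES, DEVICES_PER_EDGE, DIM))

def local_sgd(w, c, steps, lr):
    """Run `steps` gradient steps on one device's toy objective."""
    for _ in range(steps):
        grad = w - c          # gradient of 0.5 * ||w - c||^2
        w = w - lr * grad
    return w

w_global = np.zeros(DIM)
for k in range(GLOBAL_ROUNDS):
    edge_models = []
    for e in range(NUM_EDGES):
        w_edge = w_global.copy()
        # Intermittent aggregation inside the subnetwork via the edge server.
        for _ in range(EDGE_ROUNDS):
            device_models = [
                local_sgd(w_edge.copy(), targets[e, d], LOCAL_STEPS, LR)
                for d in range(DEVICES_PER_EDGE)
            ]
            w_edge = np.mean(device_models, axis=0)
        edge_models.append(w_edge)
    # Cloud aggregation, then the local-global combination (assumed form).
    w_cloud = np.mean(edge_models, axis=0)
    w_global = ALPHA * w_global + (1 - ALPHA) * w_cloud

print("global model after training (first 3 coords):", w_global[:3])
```

Running the sketch shows the global model drifting toward the mean of the device targets, which is the expected behavior for averaged local SGD on these toy objectives; the real DFL algorithm additionally adapts the aggregation intervals and combiner to the observed communication delays.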