Federated learning has gained popularity as a means of training models distributed across the wireless edge. The paper introduces delay-aware federated learning (DFL) to improve the efficiency of distributed machine learning (ML) model training by addressing communication delays between the edge and the cloud. DFL performs multiple stochastic gradient descent iterations on device datasets during each global aggregation interval and intermittently aggregates model parameters through edge servers in local subnetworks. At each global synchronization, the cloud server reconciles the local models with the deployed global model via a local-global combiner. The convergence behavior of DFL is theoretically investigated under a generalized data heterogeneity metric, and a set of conditions is obtained under which DFL achieves a sublinear convergence rate of O(1/k). Based on these findings, an adaptive control algorithm is developed for DFL that implements policies to reduce energy consumption and edge-to-cloud communication latency while still targeting the sublinear convergence rate. Numerical evaluations show that DFL achieves faster global model convergence, lower resource consumption, and greater robustness against communication delays than existing FL algorithms. In summary, the proposed method improves training efficiency and delivers strong results for both convex and non-convex loss functions.
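To make the described training loop concrete, the following is a minimal sketch of the hierarchical device/edge/cloud structure summarized above: devices run several local SGD steps, edge servers intermittently average their subnetworks, and the cloud mixes the aggregated model with the deployed global model through a local-global combiner. The synthetic least-squares objective, the grouping of devices into two edge subnetworks, and the names `local_steps`, `edge_rounds`, and `gamma` (the combiner weight) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, num_devices = 10, 8
edge_groups = [list(range(0, 4)), list(range(4, 8))]  # two edge subnetworks (assumed)

# Synthetic per-device data: least-squares loss f_i(w) = ||A_i w - b_i||^2 / (2 m_i)
A = [rng.normal(size=(20, dim)) for _ in range(num_devices)]
b = [A[i] @ rng.normal(size=dim) + 0.1 * rng.normal(size=20) for i in range(num_devices)]

def grad(i, w):
    """Gradient of device i's local least-squares loss."""
    return A[i].T @ (A[i] @ w - b[i]) / len(b[i])

global_w = np.zeros(dim)
device_w = [global_w.copy() for _ in range(num_devices)]
lr, local_steps, edge_rounds, gamma = 0.05, 5, 4, 0.5  # gamma: local-global combiner weight (assumed)

for k in range(50):                        # global aggregation intervals
    for _ in range(edge_rounds):           # intermittent edge-level aggregations
        for i in range(num_devices):       # multiple local SGD iterations per device
            for _ in range(local_steps):
                device_w[i] -= lr * grad(i, device_w[i])
        for group in edge_groups:          # each edge server averages its subnetwork
            edge_avg = np.mean([device_w[i] for i in group], axis=0)
            for i in group:
                device_w[i] = edge_avg.copy()
    # Cloud synchronization: local-global combiner mixes the aggregated model
    # with the currently deployed global model.
    cloud_avg = np.mean(device_w, axis=0)
    global_w = gamma * global_w + (1 - gamma) * cloud_avg
    for i in range(num_devices):
        device_w[i] = global_w.copy()

final_loss = np.mean([np.mean((A[i] @ global_w - b[i]) ** 2) / 2 for i in range(num_devices)])
print("final average loss:", final_loss)
```

In this sketch, setting `gamma = 0` recovers plain hierarchical averaging, while larger values keep the deployed global model partially anchored across synchronization points; the paper's adaptive control algorithm would instead tune such parameters to trade off energy, latency, and convergence.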