Federated Learning (FL) is a collaborative machine learning (ML) framework that combines on-device training with server-based aggregation to train a common ML model across distributed agents. In this work, we propose an asynchronous FL design with periodic aggregation to tackle the straggler issue in FL systems. Considering limited wireless communication resources, we investigate the effect of different scheduling policies and aggregation designs on convergence performance. Motivated by the importance of reducing both the bias and the variance of the aggregated model updates, we propose a scheduling policy that jointly considers the channel quality and the training data representation of user devices. The effectiveness of our channel-aware, data-importance-based scheduling policy, compared with state-of-the-art methods proposed for synchronous FL, is validated through simulations. Moreover, we show that an ``age-aware'' aggregation weighting design can significantly improve the learning performance in an asynchronous FL setting.
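To make the two mechanisms named above concrete, the following is a minimal, self-contained Python sketch of an asynchronous FL server that (i) schedules a limited number of devices per round by a joint channel-quality and data-importance score, and (ii) periodically aggregates whatever updates have arrived, down-weighting stale ones by their age. The abstract does not specify the actual metric or weighting function, so everything here is an illustrative assumption: the product score `snr * importance`, the geometric `age_weight` decay, and all array names are hypothetical placeholders, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_DEVICES = 20
K = 3          # assumed per-round scheduling budget (limited wireless resources)
DIM = 10       # toy model dimension
ROUNDS = 5
LOCAL_LR = 0.1

# Hypothetical per-device state: channel SNR, a data-importance score
# (e.g., local gradient norm as a proxy for data representation), and
# the staleness ("age") of each device's most recent update.
snr = rng.uniform(0.5, 2.0, NUM_DEVICES)
importance = rng.uniform(0.1, 1.0, NUM_DEVICES)
age = np.zeros(NUM_DEVICES, dtype=int)

w_global = np.zeros(DIM)
pending = {}  # device id -> (local update vector, age at arrival)

def schedule(k):
    """Illustrative channel-aware, data-importance-based scheduling:
    rank devices by the product of channel quality and data importance
    and pick the top-k. The paper's actual metric may differ."""
    score = snr * importance
    return np.argsort(score)[-k:]

def age_weight(a, decay=0.5):
    """Assumed age-aware weight: geometrically down-weight stale updates."""
    return decay ** a

for t in range(ROUNDS):
    # Devices train asynchronously; here each scheduled device is
    # simulated as finishing with a random local update.
    for i in schedule(K):
        local_grad = rng.normal(size=DIM) * importance[i]
        pending[i] = (-LOCAL_LR * local_grad, age[i])

    # Periodic aggregation: combine the updates that have arrived,
    # weighting each by its age, then reset those devices' ages.
    if pending:
        weights = np.array([age_weight(a) for _, a in pending.values()])
        updates = np.stack([u for u, _ in pending.values()])
        w_global += weights @ updates / weights.sum()
        for i in pending:
            age[i] = 0
        pending.clear()

    age += 1  # every device's last update grows one round older
    print(f"round {t}: ||w|| = {np.linalg.norm(w_global):.3f}")
```

The age-aware weighting echoes the bias/variance motivation in the abstract: a stale update was computed against an older global model, so giving it the same weight as a fresh one biases the aggregate, while down-weighting it by age limits that effect.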