Federated Learning (FL) is a collaborative machine learning (ML) framework that combines on-device training with server-based aggregation to train a common ML model across distributed agents. In this work, we propose an asynchronous FL design with periodic aggregation to tackle the straggler issue in FL systems. Considering limited wireless communication resources, we investigate how different scheduling policies and aggregation designs affect convergence performance. Motivated by the importance of reducing the bias and variance of the aggregated model updates, we propose a scheduling policy that jointly considers the channel quality and the training data representation of user devices. Simulations validate the effectiveness of our channel-aware, data-importance-based scheduling policy against state-of-the-art methods proposed for synchronous FL. Moreover, we show that an ``age-aware'' aggregation weighting design can significantly improve learning performance in the asynchronous FL setting.
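To make the two ingredients of the abstract concrete, the Python sketch below shows one plausible form of asynchronous FL with periodic aggregation: the server discounts stale updates with an age-dependent weight, and a scheduler picks users by a joint channel-quality / data-importance score. The decay law, the product-form scheduling score, and all names (`staleness_weight`, `schedule_users`, `periodic_aggregate`) are illustrative assumptions for exposition, not the paper's exact design.

```python
import numpy as np

def staleness_weight(age, decay=0.5):
    """Hypothetical age-aware weight: staler updates count less.
    The polynomial decay law here is an assumption, not the paper's formula."""
    return (1.0 + age) ** (-decay)

def schedule_users(channel_gains, data_importance, k):
    """Pick k users by a joint channel-quality / data-importance score.
    The product form of the score is an illustrative assumption."""
    score = np.asarray(channel_gains) * np.asarray(data_importance)
    return np.argsort(score)[-k:]  # indices of the k highest-scoring users

def periodic_aggregate(global_model, updates):
    """Server-side periodic aggregation of asynchronously received updates.
    `updates` is a list of (delta, age) pairs: a locally computed model
    update and the number of aggregation rounds by which it is stale."""
    if not updates:
        return global_model
    weights = np.array([staleness_weight(age) for _, age in updates])
    weights /= weights.sum()  # normalize so the weighted deltas average to one step
    delta = sum(w * d for w, (d, _) in zip(weights, updates))
    return global_model + delta

# Toy round: schedule 2 of 3 users, then aggregate updates of staleness 0, 1, 3.
rng = np.random.default_rng(0)
chosen = schedule_users(channel_gains=[0.2, 0.9, 0.5],
                        data_importance=[0.8, 0.3, 0.7], k=2)
model = np.zeros(4)
updates = [(rng.normal(size=4), age) for age in (0, 1, 3)]
model = periodic_aggregate(model, updates)
```

In this sketch, periodicity enters because the server aggregates whatever updates have arrived at each round rather than waiting for all users, which is precisely what mitigates stragglers; the age-aware weights then temper the bias that such stale contributions would otherwise introduce.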