Federated learning (FL) is booming rapidly with the wave of distributed machine learning and ever-increasing privacy concerns. In the FL paradigm, a central aggregation server aggregates the global model from gradient updates trained on local nodes, which mitigates the privacy leakage caused by collecting sensitive raw data. With the growing computing and communication capabilities of edge and IoT devices, applying FL on heterogeneous devices to train machine learning models has become a trend. However, the synchronous aggregation strategy of the classic FL paradigm cannot use resources effectively, especially on heterogeneous devices, because it waits for straggler devices before aggregating in each training round. Furthermore, in real-world scenarios, the disparity of data dispersed across devices (i.e., data heterogeneity) degrades model accuracy. As a result, many asynchronous FL (AFL) paradigms have been proposed for various application scenarios to improve efficiency, performance, privacy, and security. This survey comprehensively analyzes and summarizes existing AFL variants according to a novel classification scheme covering device heterogeneity, data heterogeneity, privacy and security on heterogeneous devices, and applications on heterogeneous devices. Finally, this survey highlights emerging challenges and presents promising research directions in this under-investigated field.
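The contrast between synchronous aggregation and asynchronous, staleness-aware aggregation described above can be sketched as follows. This is a minimal illustrative sketch, not any specific system from the survey: the function names are ours, and the staleness-decayed mixing coefficient follows the commonly cited FedAsync-style weighting, where an update that arrives several rounds late is blended into the global model with a smaller weight.

```python
def sync_aggregate(updates):
    """Classic synchronous round (FedAvg-style): the server must wait
    for ALL clients, so one straggler stalls the whole round.
    Returns the element-wise average of the client update vectors."""
    n = len(updates)
    return [sum(vals) / n for vals in zip(*updates)]

def async_aggregate(global_model, client_update, staleness, base_lr=0.5):
    """Asynchronous update (FedAsync-style sketch): fold in ONE client's
    model as soon as it arrives, down-weighting it by staleness
    (the number of global rounds elapsed since the client fetched
    the model). `base_lr` is an assumed hyperparameter."""
    alpha = base_lr / (1.0 + staleness)  # stale updates count for less
    return [(1.0 - alpha) * g + alpha * c
            for g, c in zip(global_model, client_update)]
```

For example, a fresh update (staleness 0) is mixed in with weight 0.5, while the same update arriving one round late is mixed in with weight 0.25, so the global model drifts less toward outdated gradients.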