Federated Learning (FL) has emerged as a promising technique that lets edge devices collaboratively learn a shared prediction model while keeping their training data on the device, thereby decoupling the ability to do machine learning from the need to store the data in the cloud. However, FL is difficult to implement realistically, both in terms of scale and systems heterogeneity. Although a number of research frameworks are available to simulate FL algorithms, they do not support the study of scalable FL workloads on heterogeneous edge devices. In this paper, we present Flower, a comprehensive FL framework that distinguishes itself from existing platforms by offering new facilities to execute large-scale FL experiments and to consider richly heterogeneous FL device scenarios. Our experiments show that Flower can scale FL experiments up to 15M clients using only a pair of high-end GPUs. Researchers can then seamlessly migrate experiments to real devices to examine other parts of the design space. We believe Flower provides the community with a critical new tool for FL research and development.
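To make the simulation workflow concrete, the sketch below shows how a pool of virtual clients can be run on a single machine with Flower's Python simulation API. It is a minimal sketch assuming the 1.x `flwr` interface (exact signatures vary across releases, and simulation requires the `flwr[simulation]` extra); `ToyClient` and its fake noisy "local update" are illustrative stand-ins for real on-device training, not code from the paper.

```python
# Minimal sketch of a simulated FL run, assuming Flower's 1.x Python API
# (install with: pip install "flwr[simulation]"); signatures vary by release.
import flwr as fl
import numpy as np


class ToyClient(fl.client.NumPyClient):
    """Illustrative client: 'local training' is a fake noisy update."""

    def __init__(self) -> None:
        self.weights = np.zeros(10, dtype=np.float32)

    def get_parameters(self, config):
        # Return the current local model as a list of NumPy arrays.
        return [self.weights]

    def fit(self, parameters, config):
        # Receive the global model, apply a stand-in "local update",
        # and return the new weights plus the local example count.
        self.weights = (parameters[0] + 0.01 * np.random.randn(10)).astype(np.float32)
        return [self.weights], 32, {}

    def evaluate(self, parameters, config):
        # Stand-in evaluation: report the norm of the global model as "loss".
        return float(np.linalg.norm(parameters[0])), 32, {}


def client_fn(cid: str):
    # Virtual clients are created on demand, so many clients can be
    # simulated without keeping one process alive per client.
    return ToyClient()


if __name__ == "__main__":
    fl.simulation.start_simulation(
        client_fn=client_fn,
        num_clients=100,  # scale this up to stress-test the simulation
        config=fl.server.ServerConfig(num_rounds=3),
        strategy=fl.server.strategy.FedAvg(),  # standard federated averaging
    )
```

Because `client_fn` instantiates clients lazily, only the clients sampled in a given round consume resources, which is what makes experiments with very large client pools feasible on modest hardware; the same `ToyClient` logic can then be moved onto real devices by connecting each one to a Flower server instead of the simulation engine.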