Federated learning is a distributed learning framework that takes full advantage of private data samples kept on edge devices. In real-world federated learning systems, these data samples are often decentralized and Non-Independently and Identically Distributed (Non-IID), causing divergence and performance degradation in the federated learning process. As a new solution, clustered federated learning groups federated clients with similar data distributions to mitigate the Non-IID effects and trains a better model for every cluster. This paper proposes StoCFL, a novel clustered federated learning approach for generic Non-IID issues. In detail, StoCFL implements a flexible CFL framework that supports an arbitrary proportion of client participation and newly joined clients in a varying FL system, while delivering a substantial improvement in model performance. Extensive experiments are conducted using four basic Non-IID settings and a real-world dataset. The results show that StoCFL obtains promising clustering results even when the number of clusters is unknown. Based on the client clustering results, models trained with StoCFL outperform baseline approaches in a variety of contexts.
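The abstract does not specify StoCFL's clustering rule, so the following is only a minimal sketch of the generic clustered-FL idea it describes: group clients whose model updates point in similar directions (a common proxy for similar data distributions), then aggregate per cluster. The cosine-similarity criterion, the greedy assignment, and the `threshold` parameter are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two flattened update vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def cluster_clients(updates, threshold=0.5):
    """Greedy clustering of client updates by cosine similarity.

    `threshold` is a hypothetical hyperparameter: a client joins the
    most similar existing cluster if similarity exceeds it, otherwise
    it starts a new cluster (so the number of clusters need not be
    known in advance).
    """
    clusters = []   # each entry: list of client indices
    centroids = []  # running mean update per cluster
    for i, u in enumerate(updates):
        best, best_sim = None, threshold
        for k, c in enumerate(centroids):
            s = cosine(u, c)
            if s > best_sim:
                best, best_sim = k, s
        if best is None:
            clusters.append([i])
            centroids.append(u.copy())
        else:
            clusters[best].append(i)
            n = len(clusters[best])
            # Incremental mean of the cluster's updates.
            centroids[best] = centroids[best] + (u - centroids[best]) / n
    return clusters

def aggregate_per_cluster(updates, clusters):
    # FedAvg within each cluster: one averaged update per cluster model.
    return [np.mean([updates[i] for i in c], axis=0) for c in clusters]
```

With two groups of clients whose updates point in distinct directions, e.g. `[1, 0]`-like and `[0, 1]`-like vectors, the sketch recovers the two groups without being told the cluster count, and each cluster then trains its own model from its averaged update.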