Federated Learning (FL) enables multiple participating devices to collaboratively contribute to a global neural network model while keeping the training data local. Unlike the centralized training setting, the training data in FL is distributed across the federated network and is non-IID, imbalanced (statistical heterogeneity), and subject to distribution shift, which increases the divergence between the local models and the global model and further degrades performance. In this paper, we propose a flexible clustered federated learning (CFL) framework named FlexCFL, in which we 1) group the training of clients based on the similarities between the clients' optimization directions to lower training divergence; 2) implement an efficient cold-start mechanism for newcomer devices to improve the framework's scalability and practicality; and 3) flexibly migrate clients to meet the challenge of client-level data distribution shift. FlexCFL achieves improvements by dividing the joint optimization into groups of sub-optimizations and can strike a balance between accuracy and communication efficiency in the distribution shift environment. We analyze the convergence and complexity of FlexCFL to demonstrate its efficiency. We also evaluate FlexCFL on several open datasets and compare it with related CFL frameworks. The results show that FlexCFL significantly improves absolute test accuracy: by +10.6% on FEMNIST compared to FedAvg, +3.5% on FashionMNIST compared to FedProx, and +8.4% on MNIST compared to FeSEM. The experimental results also show that FlexCFL is communication efficient in the distribution shift environment.
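As a rough illustration of the client-grouping idea described above (clustering clients whose local optimization directions are similar), the following sketch partitions clients by the cosine direction of their flattened model updates. This is not the authors' implementation; the use of k-means, the number of groups, and the helper name group_clients_by_update_direction are assumptions for illustration only.

```python
# Illustrative sketch only: group clients by the direction of their local updates.
# The clustering method (k-means on L2-normalized updates) and group count are
# assumptions, not FlexCFL's exact procedure.
import numpy as np
from sklearn.cluster import KMeans

def group_clients_by_update_direction(client_updates, num_groups=3, seed=0):
    """client_updates: dict {client_id: flattened local model update (np.ndarray)}."""
    ids = list(client_updates.keys())
    # Normalize each update so clustering reflects direction, not magnitude.
    directions = np.stack([
        client_updates[c] / (np.linalg.norm(client_updates[c]) + 1e-12) for c in ids
    ])
    labels = KMeans(n_clusters=num_groups, random_state=seed, n_init=10).fit_predict(directions)
    groups = {g: [] for g in range(num_groups)}
    for cid, g in zip(ids, labels):
        groups[g].append(cid)
    return groups

# Example: 10 synthetic clients with random 100-dimensional updates.
rng = np.random.default_rng(0)
updates = {f"client_{i}": rng.normal(size=100) for i in range(10)}
print(group_clients_by_update_direction(updates))
```

Grouping on normalized updates (directions rather than raw magnitudes) reflects the abstract's emphasis on similarity of optimization directions; each group can then run its own sub-optimization.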