This study investigates clustered federated learning (FL), a formulation of FL with non-i.i.d. data in which the devices are partitioned into clusters and each cluster optimally fits its data with a localized model. We propose a clustered FL framework that applies a nonconvex penalty to the pairwise differences of parameters. Without a priori knowledge of the set of devices in each cluster or of the number of clusters, this framework can autonomously estimate cluster structures. To implement the proposed framework, we introduce a novel clustered FL method called Fusion Penalized Federated Clustering (FPFC). Built upon the standard alternating direction method of multipliers (ADMM), FPFC performs partial updates at each communication round and allows parallel computation with variable workloads. These strategies significantly reduce the communication cost while preserving privacy, making FPFC practical for FL. We also propose a new warmup strategy for hyperparameter tuning in FL settings and explore an asynchronous variant of FPFC (asyncFPFC). Our theoretical analysis provides convergence guarantees for FPFC with general losses and establishes the statistical convergence rate under a linear model with squared loss. Extensive experiments demonstrate the superiority of FPFC over current methods, including its robustness and generalization capability.
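As a rough illustration of the formulation described above (a sketch, not the paper's exact notation; the symbols $f_i$, $p$, $\lambda$, and $\gamma$ below are assumptions), with $m$ devices holding local empirical losses $f_i$ and localized parameters $\theta_i$, a fusion-penalized clustered-FL objective takes the form
\[
\min_{\theta_1,\dots,\theta_m}\; \sum_{i=1}^{m} f_i(\theta_i) \;+\; \sum_{1 \le i < j \le m} p\bigl(\lVert \theta_i - \theta_j \rVert_2;\, \lambda, \gamma\bigr),
\]
where $p(\cdot;\lambda,\gamma)$ is a nonconvex fusion penalty (e.g., MCP or SCAD) that can shrink pairwise parameter differences exactly to zero, so devices whose parameters fuse are assigned to the same estimated cluster without the number of clusters being specified in advance. An ADMM-style scheme would then split this problem by introducing auxiliary copies of the pairwise differences and alternating between local parameter updates and penalized difference updates, which is consistent with the partial, parallelizable updates described in the abstract.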