To mitigate the privacy leakage and communication burden of Federated Learning (FL), decentralized FL (DFL) discards the central server and each client communicates only with its neighbors in a decentralized communication network. However, existing DFL suffers from high inconsistency among local clients, which results in severe distribution shift and inferior performance compared with centralized FL (CFL), especially on heterogeneous data or sparse communication topologies. To alleviate this issue, we propose two DFL algorithms, named DFedSAM and DFedSAM-MGS, to improve the performance of DFL. Specifically, DFedSAM leverages gradient perturbation to generate local flat models via Sharpness Aware Minimization (SAM), which searches for models with uniformly low loss values. DFedSAM-MGS further boosts DFedSAM by adopting Multiple Gossip Steps (MGS) for better model consistency, which accelerates the aggregation of local flat models and better balances communication complexity and generalization. Theoretically, we present improved convergence rates $\small \mathcal{O}\big(\frac{1}{\sqrt{KT}}+\frac{1}{T}+\frac{1}{K^{1/2}T^{3/2}(1-\lambda)^2}\big)$ and $\small \mathcal{O}\big(\frac{1}{\sqrt{KT}}+\frac{1}{T}+\frac{\lambda^Q+1}{K^{1/2}T^{3/2}(1-\lambda^Q)^2}\big)$ in the non-convex setting for DFedSAM and DFedSAM-MGS, respectively, where $1-\lambda$ is the spectral gap of the gossip matrix and $Q$ is the number of gossip steps in MGS. Empirically, our methods achieve competitive performance compared with CFL methods and outperform existing DFL methods.
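The following minimal Python sketch illustrates the two ingredients described above, SAM-style gradient perturbation for local flatness and multi-step gossip aggregation, on a toy least-squares problem. The helper names (`sam_local_update`, `gossip`), the ring gossip matrix, and all hyperparameters (`rho`, `lr`, `Q`) are illustrative assumptions for exposition, not the paper's implementation.

```python
# Toy sketch of one DFedSAM(-MGS) communication round, assuming a per-client
# least-squares objective and a doubly-stochastic ring gossip matrix W.
import numpy as np

def local_loss_grad(x, A, b):
    """Gradient of the per-client least-squares loss 0.5 * ||A x - b||^2."""
    return A.T @ (A @ x - b)

def sam_local_update(x, A, b, lr=0.05, rho=0.05, local_steps=5):
    """SAM-style local training: evaluate the gradient at a perturbed point
    x + rho * g / ||g|| so the update moves toward a flat, uniformly low-loss region."""
    for _ in range(local_steps):
        g = local_loss_grad(x, A, b)
        eps = rho * g / (np.linalg.norm(g) + 1e-12)   # ascent perturbation
        g_sam = local_loss_grad(x + eps, A, b)        # gradient at perturbed weights
        x = x - lr * g_sam
    return x

def gossip(X, W, Q=1):
    """Mix local models with the gossip matrix; Q > 1 gives Multiple Gossip Steps."""
    for _ in range(Q):
        X = W @ X            # row i becomes sum_j W[i, j] * X[j]
    return X

# Toy setup: m clients on a ring topology, d-dimensional model.
rng = np.random.default_rng(0)
m, d = 8, 10
A = [rng.normal(size=(20, d)) for _ in range(m)]
b = [rng.normal(size=20) for _ in range(m)]

# Symmetric doubly-stochastic ring: self-weight 1/2, each neighbor 1/4.
W = np.zeros((m, m))
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = 0.25
    W[i, (i + 1) % m] = 0.25

X = np.zeros((m, d))                      # one model per client (rows)
for _ in range(50):
    X = np.stack([sam_local_update(X[i], A[i], b[i]) for i in range(m)])
    X = gossip(X, W, Q=4)                 # Q = 1 recovers plain DFedSAM

print("client-to-mean inconsistency:", np.linalg.norm(X - X.mean(0)))
```

Under these assumptions, increasing `Q` drives the client models closer to their average after each round (better consistency) at the cost of `Q` times more neighbor communication, which is the trade-off reflected in the $\lambda^Q$ terms of the DFedSAM-MGS rate.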