Federated Learning (FL) is a promising framework for performing privacy-preserving, distributed learning with a set of clients. However, the data distributions across clients are often non-IID, i.e., they exhibit distribution shift, which makes efficient optimization difficult. To tackle this problem, many FL algorithms focus on mitigating the effects of data heterogeneity across clients by improving the performance of the global model. However, almost all of these algorithms adopt Empirical Risk Minimization (ERM) as the local optimizer, which easily drives the global model into a sharp valley of the loss landscape and induces large deviations on part of the local clients. Therefore, in this paper, we revisit solutions to the distribution shift problem in FL with a focus on local learning generality. To this end, we propose a general, effective algorithm, \texttt{FedSAM}, which uses Sharpness Aware Minimization (SAM) as the local optimizer, and develop a momentum FL algorithm, \texttt{MoFedSAM}, to bridge local and global models. Theoretically, we provide convergence analyses of both algorithms and a generalization bound for \texttt{FedSAM}. Empirically, our proposed algorithms substantially outperform existing FL methods and significantly reduce the learning deviation.
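As a minimal sketch of the underlying idea (notation assumed here rather than taken from the abstract: $f_i$ denotes client $i$'s empirical loss, $w$ the model parameters, and $\rho$ the perturbation radius), SAM replaces the local ERM objective with a min-max problem that favors flat minima:
\begin{equation*}
    \min_{w}\; \max_{\|\epsilon\|_2 \le \rho} f_i(w + \epsilon),
\end{equation*}
which is commonly approximated by a single ascent step $\epsilon = \rho\, \nabla f_i(w) / \|\nabla f_i(w)\|_2$ followed by a gradient descent step evaluated at the perturbed point $w + \epsilon$.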