Federated learning (FL) aims to minimize the communication complexity of training a model over heterogeneous data distributed across many clients. A common approach is local methods, where clients take multiple optimization steps over local data before communicating with the server (e.g., FedAvg). Local methods can exploit similarity between clients' data. However, in existing analyses, this comes at the cost of slow convergence in terms of the dependence on the number of communication rounds R. On the other hand, global methods, where clients simply return a gradient vector in each round (e.g., SGD), converge faster in terms of R but fail to exploit the similarity between clients even when clients are homogeneous. We propose FedChain, an algorithmic framework that combines the strengths of local methods and global methods to achieve fast convergence in terms of R while leveraging the similarity between clients. Using FedChain, we instantiate algorithms that improve upon previously known rates in the general convex and PL settings, and are near-optimal (via an algorithm-independent lower bound that we show) for problems that satisfy strong convexity. Empirical results support this theoretical gain over existing methods.
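The chaining idea described above can be illustrated with a minimal sketch, assuming a generic local phase (FedAvg-style local steps) handed off to a generic global phase (minibatch SGD with one gradient per client per round). The function names, the 50/50 split of the round budget, the least-squares client losses, and the rule for picking the better phase-one iterate are illustrative assumptions for this sketch, not the paper's exact specification.

```python
# Hedged sketch of a FedChain-style schedule: run a local method for part of
# the communication budget, then switch to a global method. All specifics
# (losses, learning rates, round split) are placeholder assumptions.
import numpy as np

def client_grad(x, client_data):
    """Gradient of an illustrative least-squares loss for one client."""
    A, b = client_data
    return A.T @ (A @ x - b) / len(b)

def local_phase(x, clients, rounds, local_steps, lr):
    """FedAvg-style phase: each client takes several local steps per round."""
    for _ in range(rounds):
        updates = []
        for data in clients:
            y = x.copy()
            for _ in range(local_steps):
                y -= lr * client_grad(y, data)
            updates.append(y)
        x = np.mean(updates, axis=0)  # server averages client iterates
    return x

def global_phase(x, clients, rounds, lr):
    """SGD-style phase: each client returns a single gradient per round."""
    for _ in range(rounds):
        g = np.mean([client_grad(x, data) for data in clients], axis=0)
        x -= lr * g
    return x

def fedchain_sketch(x0, clients, rounds, local_steps=10, lr=0.1):
    """Chain a local method into a global method across the round budget."""
    x_half = local_phase(x0, clients, rounds // 2, local_steps, lr)
    # Hand the (hopefully) better starting point to the fast global method.
    loss = lambda x: np.mean([np.sum((A @ x - b) ** 2) / (2 * len(b))
                              for A, b in clients])
    x_start = x_half if loss(x_half) <= loss(x0) else x0
    return global_phase(x_start, clients, rounds - rounds // 2, lr)
```

The intended intuition, per the abstract, is that the local phase exploits client similarity to reach a good neighborhood quickly, after which the global phase supplies the faster convergence in the number of rounds R.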