Federated Averaging (FedAvg), also known as Local SGD, is one of the most popular algorithms in Federated Learning (FL). Despite its simplicity and popularity, the convergence rate of FedAvg has thus far been undetermined. Even under the simplest assumptions (convex, smooth, homogeneous objectives with bounded covariance), the best-known upper and lower bounds do not match, and it is not clear whether the existing analysis captures the full capacity of the algorithm. In this work, we first resolve this question by providing a lower bound for FedAvg that matches the existing upper bound, showing that the existing upper-bound analysis for FedAvg is not improvable. Additionally, we establish a lower bound in a heterogeneous setting that nearly matches the existing upper bound. While our lower bounds show the limitations of FedAvg, under an additional assumption of third-order smoothness we prove more optimistic, state-of-the-art convergence results in both convex and non-convex settings. Our analysis stems from a notion we call iterate bias, defined as the deviation of the expected SGD trajectory from the noiseless gradient descent trajectory started at the same initialization. We prove novel sharp bounds on this quantity and show, intuitively, how to analyze it from a Stochastic Differential Equation (SDE) perspective.
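To make the iterate-bias notion concrete, the following is a minimal LaTeX sketch under assumed notation; the symbols $x_t$, $\tilde{x}_t$, $\eta$, $F$, and $\xi_t$ are introduced here for illustration and need not match the paper's own notation.

% Iterate bias: the gap between the expected SGD iterate and the noiseless
% gradient descent (GD) iterate started from the same point.
% SGD step on a smooth objective F, with zero-mean gradient noise \xi_t:
%   x_{t+1} = x_t - \eta ( \nabla F(x_t) + \xi_t ),   with E[\xi_t \mid x_t] = 0.
% GD step from the same initialization \tilde{x}_0 = x_0:
%   \tilde{x}_{t+1} = \tilde{x}_t - \eta \nabla F(\tilde{x}_t).
\[
  \mathrm{bias}_t \;:=\; \mathbb{E}\!\left[ x_t \right] \;-\; \tilde{x}_t ,
  \qquad \tilde{x}_0 = x_0 .
\]
% Sanity check: if F is quadratic then \nabla F is affine, so
% E[x_{t+1}] = E[x_t] - \eta \nabla F(E[x_t]) and the bias is identically zero.
% Third-order smoothness limits how far F can deviate from a quadratic,
% which is consistent with why that assumption enables the sharper rates above.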