As a novel distributed learning paradigm, federated learning (FL) faces serious challenges in dealing with massive clients that have heterogeneous data distributions and heterogeneous computation and communication resources. Various client-variance-reduction schemes and client sampling strategies have been respectively introduced to improve the robustness of FL. Among others, primal-dual algorithms such as the alternating direction method of multipliers (ADMM) have been found to be resilient to data heterogeneity and to outperform most primal-only FL algorithms. However, the reason behind this resilience has remained a mystery. In this paper, we first reveal that federated ADMM is essentially a client-variance-reduced algorithm. While this explains the inherent robustness of federated ADMM, its vanilla version lacks the ability to adapt to the degree of client heterogeneity. Moreover, under client sampling the global model at the server is biased, which slows down practical convergence. To go beyond ADMM, we propose a novel primal-dual FL algorithm, termed FedVRA, which allows one to adaptively control the variance-reduction level and the bias of the global model. In addition, FedVRA unifies several representative FL algorithms in the sense that they are either special instances of FedVRA or close to it. Extensions of FedVRA to semi-supervised and unsupervised learning are also presented. Experiments on (semi-)supervised image classification tasks demonstrate the superiority of FedVRA over existing schemes in learning scenarios with massive heterogeneous clients and client sampling.
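For concreteness, the following is a minimal sketch of the consensus-form federated ADMM that the variance-reduction claim refers to; the notation here is ours, not taken from the paper. Client $i$ holds a local loss $f_i$, $x$ is the global model, $\lambda_i$ is client $i$'s dual variable, and $\rho > 0$ is the penalty parameter of the augmented Lagrangian. At round $r$:

\begin{align*}
x_i^{r+1} &= \arg\min_{x_i}\; f_i(x_i) + \langle \lambda_i^{r},\, x_i - x^{r}\rangle + \frac{\rho}{2}\,\|x_i - x^{r}\|^2 && \text{(local update at client } i\text{)}\\
x^{r+1} &= \frac{1}{m}\sum_{i=1}^{m}\Big(x_i^{r+1} + \frac{1}{\rho}\,\lambda_i^{r}\Big) && \text{(server aggregation)}\\
\lambda_i^{r+1} &= \lambda_i^{r} + \rho\,\big(x_i^{r+1} - x^{r+1}\big) && \text{(dual update at client } i\text{)}
\end{align*}

The first-order condition of the local step gives $\lambda_i^{r} + \rho\,(x_i^{r+1} - x^{r}) = -\nabla f_i(x_i^{r+1})$, so, up to the global-model drift term $\rho\,(x^{r} - x^{r+1})$, the dual variable tracks the negative local gradient and enters the aggregation as a client-specific correction. In this reading, $\lambda_i$ plays the role of a control variate as in SCAFFOLD-type client-variance-reduction schemes, which is one way to see the connection drawn above.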