Federated bilevel optimization has attracted increasing attention due to emerging machine learning and communication applications. The biggest challenge lies in computing the gradient of the upper-level objective function (i.e., the hypergradient) in the federated setting, due to the nonlinear and distributed construction of a series of global Hessian matrices. In this paper, we propose a novel communication-efficient federated hypergradient estimator via aggregated iterative differentiation (AggITD). AggITD is simple to implement and significantly reduces the communication cost by performing the federated hypergradient estimation and the lower-level optimization simultaneously. We show that the proposed AggITD-based algorithm achieves the same sample complexity as existing approximate implicit differentiation (AID)-based approaches with far fewer communication rounds in the presence of data heterogeneity. Our results also shed light on the significant advantage of ITD over AID in federated/distributed hypergradient estimation. This stands in contrast to non-distributed bilevel optimization, where ITD is known to be less efficient than AID. Extensive experiments demonstrate the effectiveness and communication efficiency of the proposed method.
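For context, the quantity at stake is the hypergradient of the bilevel problem $\min_x \Phi(x) := f(x, y^*(x))$ subject to $y^*(x) = \arg\min_y g(x, y)$. Under the standard smoothness and lower-level strong-convexity assumptions, it takes the well-known implicit form (a common convention in the bilevel optimization literature, not necessarily this paper's exact notation):
$$\nabla \Phi(x) = \nabla_x f\big(x, y^*(x)\big) - \nabla_x \nabla_y g\big(x, y^*(x)\big)\,\big[\nabla_y^2 g\big(x, y^*(x)\big)\big]^{-1}\,\nabla_y f\big(x, y^*(x)\big).$$
In the federated setting, $g$ averages the clients' local lower-level objectives, so the Hessian $\nabla_y^2 g$ is a global matrix that no single client can form locally. ITD sidesteps an explicit inversion by differentiating through the iterates of the lower-level optimization, which is what allows the hypergradient estimation to share communication rounds with the lower-level updates.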