While personalization in distributed learning has been studied extensively, existing approaches rely on dedicated algorithms, each tailored to optimizing one specific type of parameter (e.g., client clusters or model interpolation weights), which makes it difficult to optimize different types of parameters jointly for better performance. Moreover, these algorithms require centralized or static undirected communication networks, which can be vulnerable to center-point failures or deadlocks. This study proposes optimizing the various types of personalization parameters with a single algorithm that runs in more practical communication environments. First, we propose a gradient-based bilevel optimization that reduces most personalization approaches to the optimization of client-wise hyperparameters. Second, we propose a decentralized algorithm for estimating gradients with respect to these hyperparameters, which runs even on stochastic and directed communication networks. Our empirical results demonstrate that the gradient-based bilevel optimization enables combining existing personalization approaches, achieving state-of-the-art performance, and confirm that it works across multiple simulated communication environments, including a stochastic and directed network.
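To make the client-wise hyperparameter view concrete, the reduction can be sketched as a generic bilevel program; the notation below ($\lambda_i$ for client $i$'s personalization hyperparameters, $f_i$ and $g_i$ for its validation and training objectives) is illustrative and not taken from the paper itself:
\[
\min_{\lambda_1,\dots,\lambda_n}\ \frac{1}{n}\sum_{i=1}^{n} f_i\bigl(\theta_i^\ast(\lambda_i)\bigr)
\quad \text{s.t.} \quad
\theta_i^\ast(\lambda_i) \in \operatorname*{arg\,min}_{\theta}\ g_i(\theta, \lambda_i).
\]
Under this reading, $\lambda_i$ may encode, for example, a client's cluster assignment or its model interpolation weight, and the hypergradient $\nabla_{\lambda_i} f_i\bigl(\theta_i^\ast(\lambda_i)\bigr)$ is the quantity that the proposed decentralized algorithm estimates over the communication network.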