Bilevel optimization is a popular hierarchical model in machine learning and has been widely applied to tasks such as meta learning, hyperparameter learning and policy optimization. Although many bilevel optimization algorithms have recently been developed, few adaptive algorithms focus on bilevel optimization in the distributed setting. It is well known that adaptive gradient methods show superior performance in both distributed and non-distributed optimization. In this paper, we therefore propose a novel adaptive federated bilevel optimization algorithm (AdaFBiO) to solve distributed bilevel optimization problems in which the objective function of the Upper-Level (UL) problem is possibly nonconvex and that of the Lower-Level (LL) problem is strongly convex. Specifically, our AdaFBiO algorithm builds on the momentum-based variance-reduced technique and local-SGD to obtain the best known sample and communication complexities simultaneously. In particular, AdaFBiO uses unified adaptive matrices to flexibly incorporate various adaptive learning rates when updating the variables in both the UL and LL problems. Moreover, we provide a convergence analysis framework for AdaFBiO and prove that it requires a sample complexity of $\tilde{O}(\epsilon^{-3})$ and a communication complexity of $\tilde{O}(\epsilon^{-2})$ to obtain an $\epsilon$-stationary point. Experimental results on federated hyper-representation learning and federated data hyper-cleaning tasks verify the efficiency of our algorithm.
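For orientation, the setting described above can be written schematically as follows. The notation is an assumption introduced here for illustration and is not taken from the abstract: $M$ clients, local UL objectives $f^m$, local LL objectives $g^m$, a gradient estimate $w_t$, a momentum weight $\alpha_t \in (0,1]$, a step size $\gamma$, and an adaptive matrix $A_t$. The first display is a standard form of a federated bilevel problem with a strongly convex LL objective. The second is the generic momentum-based variance-reduced (STORM-style) estimator for a stochastic objective $\ell(x;\xi)$, followed by an adaptive-matrix step. It sketches the kind of update the abstract refers to and is not the paper's exact algorithm.

$$
\min_{x} \; F(x) := \frac{1}{M}\sum_{m=1}^{M} f^m\big(x, y^*(x)\big)
\quad \text{s.t.} \quad
y^*(x) = \arg\min_{y} \; \frac{1}{M}\sum_{m=1}^{M} g^m(x, y)
$$

$$
w_{t+1} = \nabla \ell(x_{t+1};\xi_{t+1}) + (1-\alpha_{t+1})\big(w_t - \nabla \ell(x_t;\xi_{t+1})\big),
\qquad
x_{t+1} = x_t - \gamma A_t^{-1} w_t
$$

Under this notation, an $\epsilon$-stationary point is commonly taken to be a point $x$ with $\mathbb{E}\|\nabla F(x)\| \le \epsilon$. The precise measure used in the convergence analysis may differ.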