Federated learning is a popular distributed learning paradigm in machine learning. Meanwhile, compositional optimization is an effective hierarchical learning model that appears in many machine learning applications such as meta learning and robust learning. Although a few federated compositional optimization algorithms have recently been proposed, they still suffer from high sample and communication complexities. Thus, in this paper, we propose a class of faster federated compositional optimization algorithms (i.e., MFCGD and AdaMFCGD) to solve nonconvex distributed compositional problems, which build on momentum-based variance-reduction and local-SGD techniques. In particular, our adaptive algorithm (i.e., AdaMFCGD) uses a unified adaptive matrix to flexibly incorporate various adaptive learning rates. Moreover, we provide a solid theoretical analysis of our algorithms under the non-i.i.d. setting, and prove that they simultaneously achieve lower sample and communication complexities than existing federated compositional optimization algorithms. Specifically, our algorithms attain a sample complexity of $\tilde{O}(\epsilon^{-3})$ and a communication complexity of $\tilde{O}(\epsilon^{-2})$ in finding an $\epsilon$-stationary solution. We conduct numerical experiments on robust federated learning and distributed meta learning tasks to demonstrate the efficiency of our algorithms.
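Since the abstract only names the building blocks, the following is a minimal single-client sketch of a momentum-based variance-reduced (STORM-style) compositional gradient step for $\min_x \mathbb{E}_{\zeta}[f(\mathbb{E}_{\xi}[g(x;\xi)];\zeta)]$. It is not the authors' MFCGD/AdaMFCGD pseudocode: the oracle names (`g_oracle`, `jac_g_oracle`, `grad_f_oracle`), the step size and momentum constants, and the toy problem are all assumptions for illustration, and the local-SGD and adaptive-matrix components are omitted.

```python
# Illustrative sketch only: momentum-based variance-reduced (STORM-style) update
# for a stochastic compositional problem min_x E[f(E[g(x)])]. The exact MFCGD /
# AdaMFCGD rules, constants, and federated (local-SGD) structure are assumptions
# not taken from the abstract.
import numpy as np

rng = np.random.default_rng(0)
d, p = 5, 3   # dimensions of the variable x and of the inner mapping g(x)

def g_oracle(x, xi):          # stochastic inner mapping g(x; xi)
    return np.tanh(x[:p]) + 0.01 * xi

def jac_g_oracle(x, xi):      # stochastic Jacobian of g at x (shape p x d)
    J = np.zeros((p, d))
    J[:, :p] = np.diag(1.0 - np.tanh(x[:p]) ** 2)
    return J + 0.01 * xi

def grad_f_oracle(u, zeta):   # stochastic gradient of the outer function f at u
    return u + 0.01 * zeta

eta, alpha, beta = 0.05, 0.1, 0.1   # step size and momentum parameters (assumed values)
x = rng.standard_normal(d)
u = g_oracle(x, rng.standard_normal(p))                     # running estimate of g(x)
v = jac_g_oracle(x, rng.standard_normal((p, d))).T @ grad_f_oracle(u, rng.standard_normal(p))

for t in range(200):
    # plain SGD step; AdaMFCGD would instead precondition v with an adaptive matrix
    x_new = x - eta * v
    xi, xj, zeta = rng.standard_normal(p), rng.standard_normal((p, d)), rng.standard_normal(p)
    # STORM-style variance-reduced update of the inner-value estimate,
    # evaluating the same sample xi at the new and old iterates
    u_new = g_oracle(x_new, xi) + (1.0 - beta) * (u - g_oracle(x, xi))
    # STORM-style variance-reduced update of the full gradient estimate
    grad_new = jac_g_oracle(x_new, xj).T @ grad_f_oracle(u_new, zeta)
    grad_old = jac_g_oracle(x, xj).T @ grad_f_oracle(u, zeta)
    v = grad_new + (1.0 - alpha) * (v - grad_old)
    x, u = x_new, u_new
```

In a federated version, each client would run such local steps between communication rounds and periodically average its iterate and estimators with the server, which is how local-SGD-style methods reduce communication complexity.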