Bilevel optimization has been applied to a wide variety of machine learning models, and numerous stochastic bilevel optimization algorithms have been developed in recent years. However, most existing algorithms restrict their focus to the single-machine setting and are therefore incapable of handling distributed data. To address this issue, under a setting where all participants form a network and perform peer-to-peer communication within it, we developed two novel decentralized stochastic bilevel optimization algorithms based on the gradient tracking communication mechanism and two different gradient estimators. Moreover, we established their convergence rates for nonconvex-strongly-convex problems using novel theoretical analysis strategies. To our knowledge, this is the first work to achieve these theoretical results. Finally, we applied our algorithms to practical machine learning models, and the experimental results confirmed the efficacy of our algorithms.
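To make the gradient tracking communication mechanism concrete, below is a minimal sketch of decentralized gradient tracking on a toy single-level quadratic problem. This is an illustration only, not the paper's method: the ring topology, the mixing matrix `W`, the step size `alpha`, and the deterministic local gradients are all assumptions for the sketch, and the paper's bilevel hypergradient estimators are not reproduced here.

```python
import numpy as np

# Toy setup: node i holds f_i(x) = 0.5 * ||x - b_i||^2, so the
# network minimizes the average f(x) = (1/n) * sum_i f_i(x),
# whose minimizer is the mean of the b_i (illustrative choice).
n, d, T, alpha = 8, 5, 200, 0.1
rng = np.random.default_rng(0)
b = rng.normal(size=(n, d))          # local data held by each node

# Doubly stochastic mixing matrix W for a ring topology:
# each node averages with itself and its two neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

def local_grad(x):
    # Stacked local gradients: grad f_i(x_i) = x_i - b_i.
    return x - b

x = np.zeros((n, d))                 # local iterates x_i
y = local_grad(x)                    # trackers, initialized to local gradients
g_old = y.copy()

for t in range(T):
    x = W @ x - alpha * y            # consensus averaging + descent along tracker
    g_new = local_grad(x)
    y = W @ y + g_new - g_old        # gradient tracking: y_i tracks the
    g_old = g_new                    # network-average gradient over time

print("consensus error:", np.linalg.norm(x - x.mean(0)))
print("distance to optimum:", np.linalg.norm(x.mean(0) - b.mean(0)))
```

The key design point is the tracker update `y = W @ y + g_new - g_old`: by construction the average of the `y_i` always equals the average of the current local gradients, so each node descends along an estimate of the global gradient using only peer-to-peer communication. The paper's algorithms apply this mechanism to stochastic bilevel hypergradient estimates rather than the exact quadratic gradients used here.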