Bilevel optimization has recently attracted growing interests due to its wide applications in modern machine learning problems. Although recent studies have characterized the convergence rate for several such popular algorithms, it is still unclear how much further these convergence rates can be improved. In this paper, we address this fundamental question from two perspectives. First, we provide the first-known lower complexity bounds of $\widetilde{\Omega}(\frac{1}{\sqrt{\mu_x}\mu_y})$ and $\widetilde \Omega\big(\frac{1}{\sqrt{\epsilon}}\min\{\frac{1}{\mu_y},\frac{1}{\sqrt{\epsilon^{3}}}\}\big)$ respectively for strongly-convex-strongly-convex and convex-strongly-convex bilevel optimizations. Second, we propose an accelerated bilevel optimizer named AccBiO, whose complexity improves the existing upper bounds orderwisely under strongly-convex-strongly-convex, convex-strongly-convex and nonconvex-strongly-convex geometries. We further show that AccBiO achieves the optimal results (i.e., the upper and lower bounds match) under certain conditions up to logarithmic factors. Interestingly, our lower bounds under both geometries are larger than the corresponding optimal complexities of minimax optimization, establishing that bilevel optimization is provably more challenging than minimax optimization. We finally discuss the extensions and applications of our results to other problems such as minimax optimization.
翻译:最近,由于在现代机器学习问题中的广泛应用,双层优化最近吸引了越来越多的兴趣。尽管最近的研究对若干这种流行算法的趋同率作了特征的描述,但仍然不清楚这些趋同率还能进一步改进多少。在本文件中,我们从两个角度分别解决这个基本问题。首先,我们提供了第一个已知的低复杂度的 $- 宽度=Omega}(\ frac{1\ sqrt}1\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ x\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\