This paper proposes a new algorithm, the Momentum-assisted Single-timescale Stochastic Approximation (MSTSA), for unconstrained bilevel optimization problems. We focus on bilevel problems in which the lower-level subproblem is strongly convex. Unlike prior works, which rely on two-timescale or double-loop techniques to track the optimal solution of the lower-level subproblem, we design a stochastic momentum-assisted gradient estimator for the upper-level updates. This estimator lets us gradually control the error in the stochastic gradient updates caused by an inexact solution of the lower-level subproblem. We show that if the upper-level objective function is smooth but possibly non-convex (resp. strongly convex), MSTSA requires $\mathcal{O}(\epsilon^{-2})$ (resp. $\mathcal{O}(\epsilon^{-1})$) iterations, each using a constant number of samples, to find an $\epsilon$-stationary (resp. $\epsilon$-optimal) solution. This achieves the best-known guarantees for stochastic bilevel problems. We validate our theoretical results by demonstrating the efficiency of MSTSA on hyperparameter optimization and data hyper-cleaning problems.
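To make the single-timescale idea concrete, the following is a minimal sketch on a toy scalar problem, not the paper's implementation. Everything in it is an assumption chosen for illustration: the objectives $f(x,y) = \tfrac{1}{2}(y-1)^2 + 0.05x^2$ and $g(x,y) = \tfrac{1}{2}(y-x)^2$, the additive Gaussian noise model, the constant step sizes `alpha`, `beta` and momentum weight `eta`, and the recursive-momentum (STORM-style) form of the upper-level estimator, which the abstract does not spell out. Each iteration takes one stochastic gradient step on the lower-level variable and one momentum-assisted step on the upper-level variable, with no inner loop.

```python
import random


def lower_grad(x, y, noise):
    """Noisy gradient in y of the toy lower-level objective g = 0.5*(y - x)^2."""
    return (y - x) + noise


def hypergrad(x, y, noise):
    """Noisy surrogate hypergradient for the toy problem (illustrative only).

    grad_x f - grad_xy g * (grad_yy g)^-1 * grad_y f, with
    f = 0.5*(y - 1)^2 + 0.05*x^2, grad_xy g = -1 and grad_yy g = 1.
    """
    return 0.1 * x + (y - 1.0) + noise


def run(num_iters=2000, alpha=0.05, beta=0.1, eta=0.1, sigma=0.1, seed=0):
    """Single-loop bilevel updates with a recursive-momentum upper estimator."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0

    # First iteration: there is no previous estimate, so use a plain sample.
    y = y - beta * lower_grad(x, y, rng.gauss(0.0, sigma))
    d = hypergrad(x, y, rng.gauss(0.0, sigma))
    x_prev, y_prev = x, y  # point at which d was evaluated
    x = x - alpha * d

    for _ in range(num_iters - 1):
        # One stochastic gradient step on the lower-level variable.
        y = y - beta * lower_grad(x, y, rng.gauss(0.0, sigma))

        # Momentum-assisted estimate: the same sample xi is evaluated at the
        # current and the previous point, so the correction term reduces the
        # variance carried over from the previous estimate.
        xi = rng.gauss(0.0, sigma)
        d = hypergrad(x, y, xi) + (1.0 - eta) * (d - hypergrad(x_prev, y_prev, xi))

        x_prev, y_prev = x, y
        x = x - alpha * d  # upper-level step, same order of step size as beta

    return x, y


if __name__ == "__main__":
    x_final, y_final = run()
    print("x =", x_final, "y =", y_final)
```

In this toy problem the lower-level solution is $y^*(x) = x$, so the upper-level objective reduces to $\tfrac{1}{2}(x-1)^2 + 0.05x^2$ with minimizer $x = 1/1.1$. The iterates should settle near that point while `y` tracks `x`, even though the lower-level problem is never solved to accuracy at any iteration.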