In this paper, we study the problem of escaping saddle points in smooth nonconvex optimization problems constrained to a convex set $\mathcal{C}$. We propose a generic framework that converges to a second-order stationary point of the problem, provided that the convex set $\mathcal{C}$ is simple for a quadratic objective function. Specifically, our results hold if one can find a $\rho$-approximate solution of a quadratic program subject to $\mathcal{C}$ in polynomial time, where $\rho<1$ is a positive constant that depends on the structure of the set $\mathcal{C}$. Under this condition, we show that the sequence of iterates generated by the proposed framework reaches an $(\epsilon,\gamma)$-second-order stationary point (SOSP) in at most $\mathcal{O}(\max\{\epsilon^{-2},\rho^{-3}\gamma^{-3}\})$ iterations. We further characterize the overall complexity of reaching an SOSP when the convex set $\mathcal{C}$ can be written as a set of quadratic constraints and the Hessian of the objective function has a specific structure over $\mathcal{C}$. Finally, we extend our results to the stochastic setting and characterize the number of stochastic gradient and Hessian evaluations needed to reach an $(\epsilon,\gamma)$-SOSP.
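To get a feel for how the two terms in the iteration bound $\mathcal{O}(\max\{\epsilon^{-2},\rho^{-3}\gamma^{-3}\})$ trade off, the following minimal sketch evaluates both terms for given accuracies. It is only an illustration of the scaling: the constants hidden by the $\mathcal{O}(\cdot)$ notation are ignored, and the function name `sosp_iteration_scaling` and its labels are hypothetical, not taken from the paper.

```python
from typing import Tuple


def sosp_iteration_scaling(eps: float, gamma: float, rho: float) -> Tuple[float, str]:
    """Evaluate max(eps^-2, rho^-3 * gamma^-3) and report which term dominates.

    Assumption: constants hidden by the O(.) notation are dropped, so the
    returned number is a scaling indicator, not an actual iteration count.
    """
    if eps <= 0 or gamma <= 0:
        raise ValueError("eps and gamma must be positive")
    if not 0 < rho < 1:
        raise ValueError("rho must satisfy 0 < rho < 1")

    first_order_term = eps ** -2             # cost tied to the first-order accuracy eps
    second_order_term = (rho * gamma) ** -3  # equals rho^-3 * gamma^-3

    if first_order_term >= second_order_term:
        return first_order_term, "first-order"
    return second_order_term, "second-order"


if __name__ == "__main__":
    for rho in (0.5, 0.1):
        value, dominant = sosp_iteration_scaling(eps=1e-2, gamma=1e-1, rho=rho)
        print(f"rho={rho}: scaling ~ {value:.3g} ({dominant} term dominates)")
```

With $\epsilon=10^{-2}$ and $\gamma=10^{-1}$, the first term is $10^{4}$. For $\rho=0.5$ the second term is $8\times 10^{3}$, so the first-order term dominates; for $\rho=0.1$ the second term grows to $10^{6}$ and takes over. A weaker approximation factor $\rho$ for the quadratic subproblem therefore inflates only the second-order part of the bound.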