Nonconvex optimization, with its high demand for fast solvers, is ubiquitous in modern machine learning. This paper studies two simple accelerated gradient methods, restarted accelerated gradient descent (AGD) and the restarted heavy ball (HB) method, for general nonconvex problems under the gradient Lipschitz and Hessian Lipschitz conditions. We establish, with simple proofs, that both algorithms find an $\epsilon$-approximate first-order stationary point within $O(\epsilon^{-7/4})$ gradient computations. Our complexity does not hide any polylogarithmic factors, and thus improves over the state-of-the-art complexity by a factor of $O(\log\frac{1}{\epsilon})$. Our algorithms are simple in the sense that they consist only of Nesterov's classical AGD or Polyak's HB iterations together with a restart mechanism; they require neither negative curvature exploitation nor the minimization of regularized surrogate functions. Our proofs use only elementary analysis and, in contrast with existing work, do not invoke the analysis of strongly convex AGD or HB.
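To make the algorithmic structure described above concrete, the following is a minimal sketch of restarted AGD: classical Nesterov iterations plus a restart that kills the momentum once the iterates have moved far enough within an epoch. The step size `eta`, momentum parameter `theta`, restart threshold `B`, epoch length `K`, and the stopping test are illustrative placeholders, not the parameter choices derived in the paper, which are set from the gradient and Hessian Lipschitz constants.

```python
import numpy as np

def restarted_agd(grad, x0, eta=1e-3, theta=0.1, B=1.0, K=100,
                  eps=1e-3, max_iters=100_000):
    """Sketch of restarted accelerated gradient descent (not the paper's
    exact parameterization). `grad` returns the gradient of the objective."""
    x = np.asarray(x0, dtype=float).copy()
    total = 0
    while total < max_iters:
        # Start a new epoch from the current point with zero momentum.
        x_prev = x.copy()
        moved = 0.0
        for _ in range(K):
            # Nesterov's extrapolation step followed by a gradient step.
            y = x + (1.0 - theta) * (x - x_prev)
            x_prev = x
            x = y - eta * grad(y)
            moved += np.linalg.norm(x - x_prev) ** 2
            total += 1
            # Restart once the cumulative movement within the epoch exceeds
            # the threshold; this is the only mechanism added on top of the
            # classical AGD iteration.
            if moved > B ** 2:
                break
        # Simplified stopping test at the end of each epoch.
        if np.linalg.norm(grad(x)) <= eps:
            return x  # epsilon-approximate first-order stationary point
    return x

# Example usage on a toy nonconvex objective f(x) = sum(x^2 - cos(x)).
if __name__ == "__main__":
    grad_f = lambda x: 2.0 * x + np.sin(x)
    x_star = restarted_agd(grad_f, x0=np.ones(10))
    print(np.linalg.norm(grad_f(x_star)))
```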