This paper studies accelerated gradient descent for general nonconvex problems under the gradient Lipschitz and Hessian Lipschitz assumptions. We establish, with simple proofs, that a restarted accelerated gradient descent (AGD) finds an $\epsilon$-approximate first-order stationary point in $O(\epsilon^{-7/4})$ gradient computations. Our complexity does not hide any polylogarithmic factors, and thus it improves over the state-of-the-art complexity by an $O(\log\frac{1}{\epsilon})$ factor. Our simple algorithm consists only of Nesterov's classical AGD and a restart mechanism, and it needs neither negative curvature exploitation nor the optimization of regularized surrogate functions. Technically, our simple proof does not invoke the analysis of AGD for strongly convex functions, which is crucial for removing the $O(\log\frac{1}{\epsilon})$ factor.
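Since the abstract only names the method, a minimal sketch may help fix ideas: Nesterov's AGD run as usual, restarted from the current iterate whenever the iterates travel too far within an epoch; an $\epsilon$-approximate first-order stationary point is a point with $\|\nabla f(x)\| \le \epsilon$. The restart test on cumulative squared movement and the parameters `eta`, `theta`, `B`, and `eps` below are illustrative assumptions, not the paper's calibrated choices.

```python
import numpy as np

def restarted_agd(grad, x0, eta=1e-3, theta=0.1, B=1.0, eps=1e-3, max_epochs=10000):
    """Hedged sketch of restarted AGD: Nesterov's AGD plus a restart
    mechanism.  eta (step size), theta (momentum parameter), and B
    (restart radius) are placeholder values, not the paper's tuned ones."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_epochs):
        # Start (or restart) a fresh AGD epoch from the current point.
        y = x.copy()
        x_prev = x.copy()
        moved = 0.0  # cumulative squared iterate movement in this epoch
        while True:
            g = grad(y)
            if np.linalg.norm(g) <= eps:
                return y  # approximate first-order stationary point
            x_new = y - eta * g                           # gradient step
            y = x_new + (1.0 - theta) * (x_new - x_prev)  # momentum step
            moved += np.linalg.norm(x_new - x_prev) ** 2
            x_prev = x_new
            if moved > B ** 2:
                x = x_new  # iterates moved far enough: restart from here
                break
    return x
```

For example, `restarted_agd(lambda v: 4 * v**3 - 2 * v, np.array([2.0]))` drives the gradient of the nonconvex function $f(v) = v^4 - v^2$ below the tolerance. The restart mechanism is what avoids both negative curvature exploitation and regularized surrogate subproblems, per the abstract.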