Nesterov's accelerated gradient descent (NAG) is one of the milestones in the history of first-order algorithms. The mechanism behind the acceleration phenomenon, namely the gradient correction term, was not uncovered until the high-resolution differential equation framework was proposed in [Shi et al., 2022]. To deepen our understanding of how the high-resolution differential equation framework characterizes the convergence rate, in this paper we continue to investigate NAG for $\mu$-strongly convex functions by means of Lyapunov analysis and the phase-space representation. First, we revisit the proof based on the gradient-correction scheme. Similar to [Chen et al., 2022], a straightforward calculation simplifies the proof considerably and, with a minor modification, enlarges the step size to $s=1/L$. Moreover, the construction of the Lyapunov function becomes principled. Furthermore, we investigate NAG from the implicit-velocity scheme. Owing to the difference in the velocity iterates, we find that the Lyapunov function constructed from the implicit-velocity scheme does not require the additional term and that the calculation of the iterative difference is simpler. Together with the optimal step size obtained, the high-resolution differential equation framework based on the implicit-velocity scheme of NAG is complete and outperforms the gradient-correction scheme.
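For reference, a minimal sketch of NAG for a $\mu$-strongly convex and $L$-smooth objective $f$, in one common labeling of the two sequences (the abstract does not write out the iterates, so the exact form displayed here is an assumption based on the standard convention):
\begin{align*}
y_{k+1} &= x_k - s \nabla f(x_k), \\
x_{k+1} &= y_{k+1} + \frac{1 - \sqrt{\mu s}}{1 + \sqrt{\mu s}} \bigl( y_{k+1} - y_k \bigr),
\end{align*}
where $s$ is the step size. The results summarized above concern how large $s$ can be taken (up to $s = 1/L$) and how Lyapunov functions for these iterates are constructed under the gradient-correction and implicit-velocity schemes.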