Nesterov's accelerated gradient method (NAG) is widely used in machine learning problems, including deep learning, and corresponds to a continuous-time differential equation. Through this connection, properties of the differential equation and of its numerical approximation can be studied to improve the accelerated gradient method. In this work we present a new improvement of NAG in terms of stability, inspired by numerical analysis. We give the precise order of NAG as a numerical approximation of its continuous-time limit and then present a new method of higher order. We show theoretically that our new method is more stable than NAG for large step sizes. Experiments on matrix completion and handwritten digit recognition demonstrate that the stability of our new method is better. Furthermore, the improved stability leads to higher computational speed in our experiments.
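For context, the correspondence mentioned above is usually stated as follows (this is the standard formulation due to Su, Boyd and Candès; the authors' precise setup may differ). For a smooth convex objective f, NAG iterates

\[ x_k = y_{k-1} - s\,\nabla f(y_{k-1}), \qquad y_k = x_k + \frac{k-1}{k+2}\,(x_k - x_{k-1}), \]

and as the step size \( s \to 0 \), with the identification \( t \approx k\sqrt{s} \), the iterates approach the solution of the ordinary differential equation

\[ \ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0. \]

Determining the order in s to which the discrete iterates track this ODE, and constructing a discretization of higher order with a larger stability region, is the program the abstract describes.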