We study the algorithmic stability of Nesterov's accelerated gradient method. For convex quadratic objectives, Chen et al. (2018) proved that the uniform stability of the method grows quadratically with the number of optimization steps, and conjectured that the same is true for the general convex and smooth case. We disprove this conjecture and show, for two notions of algorithmic stability (including uniform stability), that the stability of Nesterov's accelerated method in fact deteriorates exponentially fast with the number of gradient steps. This stands in sharp contrast not only to the bounds in the quadratic case, but also to known results for non-accelerated gradient methods, where stability typically grows linearly with the number of steps.
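As a rough illustration of the objects involved (not the paper's lower-bound construction), the sketch below implements Nesterov's accelerated gradient method with the standard momentum schedule and a naive empirical stability probe: run the method from the same initialization on two datasets differing in a single example and track the distance between the resulting iterates. The logistic loss, the step size `eta`, and the synthetic data are all illustrative assumptions, not choices from the paper.

```python
# Minimal sketch: Nesterov's accelerated gradient (NAG) on a smooth convex
# empirical loss, plus a crude probe of algorithmic stability -- compare the
# iterates produced on two datasets that differ in one example. Loss, step
# size, and data are illustrative assumptions.
import numpy as np

def nag(grad, x0, eta, T):
    """NAG with the standard momentum schedule lam_{t+1} = (1+sqrt(1+4lam_t^2))/2."""
    x, y, lam = x0.copy(), x0.copy(), 1.0
    for _ in range(T):
        x_next = y - eta * grad(y)                       # gradient step at the extrapolated point
        lam_next = (1.0 + np.sqrt(1.0 + 4.0 * lam**2)) / 2.0
        gamma = (lam - 1.0) / lam_next                   # momentum weight
        y = x_next + gamma * (x_next - x)                # extrapolation (momentum) step
        x, lam = x_next, lam_next
    return x

def logistic_grad(A, b):
    """Gradient of the smooth convex logistic loss averaged over dataset (A, b)."""
    def grad(w):
        margins = -b * (A @ w)
        sig = 1.0 / (1.0 + np.exp(-margins))             # sigmoid of the negative margin
        return -(A * (b * sig)[:, None]).mean(axis=0)
    return grad

rng = np.random.default_rng(0)
n, d = 50, 20
A = rng.standard_normal((n, d))
b = np.sign(rng.standard_normal(n))

# Neighboring dataset: replace a single example.
A2, b2 = A.copy(), b.copy()
A2[0] = rng.standard_normal(d)
b2[0] = -b[0]

x0, eta = np.zeros(d), 0.1
for T in (10, 50, 100, 500):
    w  = nag(logistic_grad(A, b),   x0, eta, T)
    w2 = nag(logistic_grad(A2, b2), x0, eta, T)
    print(f"T={T:4d}  ||w - w'|| = {np.linalg.norm(w - w2):.3e}")
```

The printed distance between the two iterate sequences is the quantity whose worst-case growth rate the abstract refers to; a single random instance like this only illustrates the definition of stability, and will generally not exhibit the exponential worst-case behavior the paper establishes.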