This paper is concerned with the convergence of stochastic gradient algorithms with momentum terms in the nonconvex setting. A class of stochastic momentum methods, including stochastic gradient descent, the heavy ball method, and Nesterov's accelerated gradient, is analyzed in a general framework under mild assumptions. Building on the convergence result for the expected gradients, we prove almost sure convergence through a detailed analysis of the effects of momentum and of the number of upcrossings. It is worth noting that no additional restrictions are imposed on the objective function or the stepsize. Another improvement over previous results is that the usual Lipschitz condition on the gradient is relaxed to Hölder continuity. As a byproduct, we apply a localization procedure to extend our results to stochastic stepsizes.
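For orientation, a common unified form of the momentum iteration covering these methods (the notation here is an illustrative sketch, not the paper's own) is
\[
x_{k+1} = x_k - \alpha_k g_k + \beta_k \,(x_k - x_{k-1}),
\]
where $g_k$ is a stochastic gradient of the objective $f$, $\alpha_k$ is the stepsize, and $\beta_k$ is the momentum parameter: $\beta_k = 0$ recovers stochastic gradient descent, a constant $\beta_k = \beta$ gives the heavy ball method, and Nesterov's accelerated gradient instead evaluates $g_k$ at the extrapolated point $y_k = x_k + \beta_k (x_k - x_{k-1})$. The relaxed smoothness assumption referred to above is Hölder continuity of the gradient,
\[
\|\nabla f(x) - \nabla f(y)\| \le L \,\|x - y\|^{\nu}, \qquad \nu \in (0, 1],
\]
which reduces to the usual Lipschitz condition when $\nu = 1$.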