The stochastic momentum method is a widely used acceleration technique for solving large-scale stochastic optimization problems arising in artificial neural networks. Existing convergence results for stochastic momentum methods in the non-convex stochastic setting mostly concern a randomly selected iterate or the best iterate, i.e., the one with minimum gradient norm. In contrast, we address the convergence of the last iterate (called last-iterate convergence) of stochastic momentum methods for non-convex stochastic optimization problems, in a manner consistent with traditional optimization theory. We prove last-iterate convergence under a unified framework covering both stochastic heavy ball (SHB) momentum and stochastic Nesterov accelerated gradient (NAG) momentum. The momentum factor can be fixed to a constant, rather than the time-varying coefficients required by existing analyses. Finally, the last-iterate convergence of the stochastic momentum methods is verified empirically on the benchmark MNIST and CIFAR-10 datasets.
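For reference, the two momentum schemes named above can be sketched as follows. This is a minimal illustration on a toy quadratic with noisy gradients, using generic hyperparameters (`lr`, `beta`) chosen for the example; it is not the paper's specific algorithm or step-size schedule.

```python
import numpy as np

def shb_step(x, v, grad, lr=0.01, beta=0.9):
    """Stochastic heavy ball: v_{t+1} = beta*v_t - lr*g(x_t), x_{t+1} = x_t + v_{t+1}."""
    v = beta * v - lr * grad(x)
    return x + v, v

def nag_step(x, v, grad, lr=0.01, beta=0.9):
    """Stochastic Nesterov accelerated gradient: gradient taken at the
    look-ahead point x_t + beta*v_t instead of at x_t."""
    v = beta * v - lr * grad(x + beta * v)
    return x + v, v

# Toy problem: f(x) = 0.5*||x||^2, so the true gradient is x; add small noise
# to mimic a stochastic gradient oracle.
rng = np.random.default_rng(0)
def noisy_grad(x):
    return x + 0.01 * rng.standard_normal(x.shape)

x, v = np.ones(5), np.zeros(5)
for _ in range(500):
    x, v = shb_step(x, v, noisy_grad)  # constant momentum factor throughout
print(np.linalg.norm(x))  # the last iterate, driven close to the minimizer 0
```

Note that `beta` stays constant across all iterations, matching the abstract's point that the momentum factor need not be time-varying.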