We combine two advanced ideas widely used in optimization for machine learning, the shuffling strategy and the momentum technique, to develop a novel shuffling gradient-based method with momentum for approximating a stationary point of non-convex finite-sum minimization problems. Although our method is inspired by momentum techniques, its update differs significantly from existing momentum-based methods. We establish that our algorithm achieves a state-of-the-art convergence rate for both constant and diminishing learning rates under standard assumptions (i.e., $L$-smoothness and bounded variance). When the shuffling strategy is fixed, we develop another new algorithm that is similar to existing momentum methods and covers the single-shuffling and incremental gradient schemes as special cases. We prove the same convergence rate for this algorithm under the $L$-smoothness and bounded-gradient assumptions. We demonstrate our algorithms via numerical simulations on standard datasets and compare them with existing shuffling methods. Our tests show encouraging performance of the new algorithms.
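To make the two ingredients concrete, the following is a minimal illustrative sketch of a shuffling gradient method with a momentum buffer for the finite-sum problem $\min_w F(w) = \frac{1}{n}\sum_{i=1}^n f_i(w)$. It is not the paper's exact update rule (the abstract notes that the proposed update differs from classical momentum methods); the function name `shuffling_momentum_sgd`, the step size `eta`, and the momentum weight `beta` are illustrative assumptions.

```python
# Sketch only: epoch-wise reshuffling + a heavy-ball-style momentum buffer.
# The paper's actual method uses a different momentum update; this merely
# shows the generic structure the abstract combines.
import numpy as np

def shuffling_momentum_sgd(grad_i, w0, n, epochs, eta=0.01, beta=0.9, seed=0):
    """grad_i(w, i) returns the gradient of the i-th component f_i at w."""
    rng = np.random.default_rng(seed)
    w, m = w0.copy(), np.zeros_like(w0)
    for _ in range(epochs):
        perm = rng.permutation(n)            # reshuffle once per epoch
        for i in perm:                       # one full pass over all n components
            g = grad_i(w, i)
            m = beta * m + (1.0 - beta) * g  # momentum buffer
            w = w - eta * m
    return w

# Usage example on hypothetical least-squares components
# f_i(w) = 0.5 * (a_i @ w - b_i)**2:
A = np.random.default_rng(1).standard_normal((100, 5))
b = A @ np.ones(5)
w_hat = shuffling_momentum_sgd(lambda w, i: (A[i] @ w - b[i]) * A[i],
                               np.zeros(5), n=100, epochs=50)
```

A single-shuffling or incremental-gradient variant, as mentioned in the abstract, would fix `perm` once (or use the identity ordering) instead of reshuffling every epoch.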