We combine two advanced ideas widely used in optimization for machine learning, the shuffling strategy and the momentum technique, to develop a novel shuffling gradient-based method with momentum, coined Shuffling Momentum Gradient (SMG), for non-convex finite-sum optimization problems. While our method is inspired by momentum techniques, its update is fundamentally different from that of existing momentum-based methods. We establish state-of-the-art convergence rates of SMG for any shuffling strategy, using either a constant or a diminishing learning rate, under standard assumptions (i.e., $L$-smoothness and bounded variance). When the shuffling strategy is fixed, we develop another new algorithm that is similar to existing momentum methods, and prove the same convergence rates for this algorithm under the $L$-smoothness and bounded gradient assumptions. We demonstrate our algorithms via numerical simulations on standard datasets and compare them with existing shuffling methods. Our tests show encouraging performance of the new algorithms.
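For context, the sketch below shows a generic shuffling-based SGD loop with a standard (heavy-ball-style) momentum buffer, which is the baseline setting the abstract refers to. It is only an illustration under assumed names (`grad_i`, `w0`, `lr`, `beta` are hypothetical), not the SMG update itself, which the paper states differs fundamentally from this standard momentum recursion.

```python
import numpy as np

def shuffling_sgd_momentum(grad_i, w0, n, epochs, lr=0.01, beta=0.9, seed=0):
    """Generic shuffling SGD with a standard momentum buffer (illustrative only).

    grad_i(w, i): gradient of the i-th component function at w.
    n: number of component functions in the finite sum.
    Note: this is NOT the paper's SMG update, which differs from
    this standard momentum recursion.
    """
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    m = np.zeros_like(w)               # momentum buffer
    for _ in range(epochs):
        perm = rng.permutation(n)      # reshuffle the component order each epoch
        for i in perm:
            m = beta * m + (1.0 - beta) * grad_i(w, i)
            w = w - lr * m
    return w
```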