Simple stochastic momentum methods are widely used in machine learning optimization, but their good practical performance is at odds with an absence of theoretical guarantees of acceleration in the literature. In this work, we aim to close the gap between theory and practice by showing that stochastic heavy ball momentum, which can be interpreted as a randomized Kaczmarz algorithm with momentum, retains the fast linear rate of (deterministic) heavy ball momentum on quadratic optimization problems, at least when minibatching with a sufficiently large batch size is used. The analysis relies on carefully decomposing the momentum transition matrix, and using new spectral norm concentration bounds for products of independent random matrices. We provide numerical experiments to demonstrate that our bounds are reasonably sharp.
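The iteration studied here can be illustrated with a minimal sketch of randomized Kaczmarz with heavy ball momentum on a consistent linear system Ax = b. The step size `alpha`, momentum parameter `beta`, row-sampling scheme, and iteration count below are illustrative assumptions, not the paper's exact choices (the paper's analysis further uses minibatches of rows rather than single rows):

```python
import numpy as np

def kaczmarz_momentum(A, b, alpha=1.0, beta=0.3, iters=5000, seed=0):
    """Randomized Kaczmarz with heavy ball momentum (illustrative sketch).

    Update: x_{k+1} = x_k - alpha * (a_i @ x_k - b_i) / ||a_i||^2 * a_i
                      + beta * (x_k - x_{k-1}),
    with row i sampled with probability proportional to ||a_i||^2.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = np.einsum("ij,ij->i", A, A)   # squared row norms
    probs = row_norms_sq / row_norms_sq.sum()    # sampling distribution
    x_prev = np.zeros(n)
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        residual = A[i] @ x - b[i]
        # Kaczmarz projection step plus heavy ball momentum term
        x_next = x - alpha * residual / row_norms_sq[i] * A[i] + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((50, 10))
    x_star = rng.standard_normal(10)
    b = A @ x_star                     # consistent system by construction
    x_hat = kaczmarz_momentum(A, b)
    print(np.linalg.norm(x_hat - x_star))
```

On a well-conditioned consistent system like the one above, the iterates converge linearly to the solution; the momentum term reuses the previous displacement `x - x_prev`, which is what the paper's transition-matrix analysis accelerates.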