Acceleration and momentum are the de facto standard in modern applications of machine learning and optimization, yet the bulk of the work on implicit regularization focuses instead on unaccelerated methods. In this paper, we study the statistical risk of the iterates generated by Nesterov's accelerated gradient method and Polyak's heavy ball method when applied to least squares regression, drawing several connections to explicit penalization. We carry out our analyses in continuous time, allowing us to make sharper statements than in prior work and revealing complex interactions between early stopping, stability, and the curvature of the loss function.
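For concreteness, the continuous-time dynamics commonly associated with these two methods on a least squares objective $f(x) = \tfrac{1}{2n}\lVert Ax - b\rVert_2^2$ are Nesterov's ODE with vanishing friction (as in Su, Boyd, and Candès, 2016) and the heavy ball ODE with constant friction $a > 0$,
\[
\ddot{X}(t) + \tfrac{3}{t}\,\dot{X}(t) + \nabla f\bigl(X(t)\bigr) = 0
\qquad\text{and}\qquad
\ddot{X}(t) + a\,\dot{X}(t) + \nabla f\bigl(X(t)\bigr) = 0,
\]
where the particular damping coefficients and the scaling of $f$ shown here are illustrative placeholders rather than the exact dynamics analyzed in the paper.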