分析研究以动力为基础的速度方法研究 (Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems)

The optimization step in many machine learning problems rarely relies on vanilla gradient descent but it is common practice to use momentum-based accelerated methods. Despite these algorithms being widely applied to arbitrary loss functions, their behaviour in generically non-convex, high dimensional landscapes is poorly understood. In this work, we use dynamical mean field theory techniques to describe analytically the average dynamics of these methods in a prototypical non-convex model: the (spiked) matrix-tensor model. We derive a closed set of equations that describe the behaviour of heavy-ball momentum and Nesterov acceleration in the infinite dimensional limit. By numerical integration of these equations, we observe that these methods speed up the dynamics but do not improve the algorithmic threshold with respect to gradient descent in the spiked model.

翻译：许多机器学习问题的优化步骤很少依赖于香草梯度的下降,但使用加速动力法是常见的做法。尽管这些算法被广泛应用于任意损失功能,但它们在一般非曲线、高维景观中的行为却鲜为人知。在这项工作中,我们使用动态的、中等的实地理论技术,用一种原型的非曲线模型来分析这些方法的平均动态:(喷射的)矩阵电流模型。我们得出一套封闭式方程式,描述重球动力和Nesterov加速在无限的维度限制中的行为。通过这些方程式的数值整合,我们观察到这些方法加快了动态,但没有改善在加压模型中梯度下降的算法阈值。

相关内容

MoDELS

关注 43

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【重磅】2021年IEEE Fellow出炉！ 282位新晋升会士！七十多位华人当选！

专知会员服务

23+阅读 · 2020年11月25日

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日