We study the asymptotic behavior of second-order algorithms that mix Newton's method with inertial gradient descent in non-convex landscapes. We show that, despite the Newtonian behavior of these methods, they almost always escape strict saddle points. We also highlight the role played by the hyperparameters of these methods in their qualitative behavior near critical points. The theoretical results are supported by numerical illustrations.
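As a toy illustration of the kind of scheme studied here (a hedged sketch, not the exact algorithm analyzed in the paper), consider a heavy-ball iteration whose search direction interpolates between the gradient and a Newton step, run on the strict saddle function f(x, y) = x² − y². The names, step sizes, and the mixing parameter `gamma` below are illustrative assumptions:

```python
import numpy as np

def grad(z):
    # f(x, y) = x**2 - y**2 has a strict saddle at the origin.
    return np.array([2.0 * z[0], -2.0 * z[1]])

def hess(z):
    # Constant Hessian with one positive and one negative eigenvalue.
    return np.array([[2.0, 0.0], [0.0, -2.0]])

def mixed_step(z, z_prev, alpha=0.1, beta=0.5, gamma=0.5):
    # Direction interpolating between the gradient (gamma = 0) and the
    # Newton direction (gamma = 1), combined with a heavy-ball inertial
    # term beta * (z - z_prev).  All parameter values are illustrative.
    g = grad(z)
    d = (1.0 - gamma) * g + gamma * np.linalg.solve(hess(z), g)
    return z + beta * (z - z_prev) - alpha * d

# Initialize slightly off the saddle's stable manifold.
z_prev = np.array([1e-6, 1e-6])
z = z_prev.copy()
for _ in range(200):
    z, z_prev = mixed_step(z, z_prev), z

# The x-coordinate contracts toward 0 while |y| grows:
# the iterates leave the neighborhood of the strict saddle.
print(z)
```

With these parameter values the linearized recursion contracts along the positive-curvature direction and expands along the negative-curvature one, so the iterates escape the saddle, consistent with the qualitative behavior described above; a pure Newton iteration (gamma = 1, beta = 0) would instead be attracted to it.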