In this work, we analyze the global convergence properties of coordinate gradient descent with random choice of coordinates and stepsizes for non-convex optimization problems. Under generic assumptions, we prove that the iterates of the algorithm almost surely escape strict saddle points of the objective function. As a result, the algorithm is guaranteed to converge to local minima if all saddle points are strict. Our proof is based on viewing the coordinate descent algorithm as a nonlinear random dynamical system and on a quantitative finite-block analysis of its linearization around saddle points.
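To make the object of the analysis concrete, the following is a minimal sketch of coordinate gradient descent with random coordinates and random stepsizes. The specific sampling distributions (uniform coordinate choice, stepsizes drawn from a finite set) and the test function `f(x, y) = x^2 + y^4/4 - y^2/2`, which has a strict saddle at the origin and local minima at `(0, ±1)`, are illustrative assumptions, not the precise setting of the paper.

```python
import numpy as np

def randomized_coordinate_gradient_descent(grad, x0, stepsizes, n_iters=2000, rng=None):
    """Sketch of the update rule: at each iteration, pick a coordinate i and a
    stepsize alpha at random, then take a partial gradient step on coordinate i.
    The sampling scheme below is an assumption for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    for _ in range(n_iters):
        i = rng.integers(d)              # random coordinate (uniform, assumed)
        alpha = rng.choice(stepsizes)    # random stepsize from a finite set (assumed)
        x[i] -= alpha * grad(x)[i]       # partial gradient step on coordinate i
    return x

if __name__ == "__main__":
    # f(x, y) = x**2 + y**4/4 - y**2/2 has a strict saddle at (0, 0)
    # and local minima at (0, 1) and (0, -1).
    grad = lambda x: np.array([2.0 * x[0], x[1] ** 3 - x[1]])
    x_final = randomized_coordinate_gradient_descent(
        grad, x0=[1e-3, 1e-3], stepsizes=[0.05, 0.1]
    )
    print(x_final)  # expected to leave the saddle and approach a local minimum
```

In this toy run the iterates, started near the strict saddle, are pushed away along the unstable `y` direction and settle near one of the local minima, which is the qualitative behavior the almost-sure escape result guarantees.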