The gradient descent-ascent (GDA) algorithm has been widely applied to solve nonconvex minimax optimization problems. However, existing GDA-type algorithms can only find first-order stationary points of the envelope function of nonconvex minimax optimization problems, which does not rule out the possibility of getting stuck at suboptimal saddle points. In this paper, we develop Cubic-GDA -- the first GDA-type algorithm for escaping strict saddle points in nonconvex-strongly-concave minimax optimization. Specifically, the algorithm uses gradient ascent to estimate the second-order information of the minimax objective function, and it leverages the cubic regularization technique to efficiently escape strict saddle points. Under standard smoothness assumptions on the objective function, we show that Cubic-GDA admits an intrinsic potential function whose value monotonically decreases along the minimax optimization process. This property yields the desired global convergence of Cubic-GDA to a second-order stationary point at a sublinear rate. Moreover, we analyze the convergence rate of Cubic-GDA over the full spectrum of a gradient-dominance-type nonconvex geometry. Our result shows that Cubic-GDA achieves an orderwise faster convergence rate than standard GDA for a wide spectrum of gradient dominance geometries. Our study bridges minimax optimization with second-order optimization and may inspire new developments along this direction.
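To make the scheme concrete, below is a minimal Python sketch of the Cubic-GDA idea described above: an inner gradient-ascent loop tracks the maximizer over y, and the outer update on x minimizes a cubic-regularized second-order model. The function names (cubic_gda, solve_cubic_subproblem), the gradient-descent subproblem solver, all step sizes, and the use of the partial Hessian of f as a proxy for the envelope Hessian are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def solve_cubic_subproblem(g, H, M, iters=200, lr=0.01):
    """Approximately minimize the cubic-regularized model
        m(s) = g^T s + 0.5 * s^T H s + (M / 6) * ||s||^3
    by plain gradient descent on s (a simple placeholder solver;
    the paper's subproblem solver may differ)."""
    s = np.zeros_like(g)
    for _ in range(iters):
        # grad m(s) = g + H s + (M / 2) * ||s|| * s
        grad_m = g + H @ s + 0.5 * M * np.linalg.norm(s) * s
        s -= lr * grad_m
    return s

def cubic_gda(grad_x, grad_y, hess_xx, x, y,
              eta_y=0.1, ascent_steps=50, M=10.0, outer_iters=100):
    """Sketch of the Cubic-GDA scheme: gradient ascent on the strongly
    concave variable y, then a cubic-regularized step on x.
    Here hess_xx(x, y) serves as a simplified proxy for the Hessian of
    the envelope function Phi(x) = max_y f(x, y)."""
    for _ in range(outer_iters):
        # Inner loop: gradient ascent tracks the maximizer y*(x) and
        # thereby provides second-order information about the envelope.
        for _ in range(ascent_steps):
            y = y + eta_y * grad_y(x, y)
        # Outer step: cubic regularization (Nesterov-Polyak style)
        # on the minimization variable x.
        g = grad_x(x, y)
        H = hess_xx(x, y)
        x = x + solve_cubic_subproblem(g, H, M)
    return x, y

# Toy nonconvex-strongly-concave instance (hypothetical, only to
# exercise the sketch): f(x, y) = ||x||^2 * sum(y) - 0.5 * ||y||^2
grad_x = lambda x, y: 2.0 * np.sum(y) * x
grad_y = lambda x, y: np.full_like(y, np.dot(x, x)) - y
hess_xx = lambda x, y: 2.0 * np.sum(y) * np.eye(x.size)

x, y = cubic_gda(grad_x, grad_y, hess_xx,
                 x=np.array([1.0, -0.5]), y=np.array([0.2]))
```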