In this paper, we study a class of nonconvex-nonconcave minimax optimization problems (i.e., $\min_x\max_y f(x,y)$), where $f(x,y)$ is possibly nonconvex in $x$, and is nonconcave in $y$ but satisfies the Polyak-Lojasiewicz (PL) condition in $y$. Moreover, we propose a class of enhanced momentum-based gradient descent ascent methods (i.e., MSGDA and AdaMSGDA) to solve these stochastic nonconvex-PL minimax problems. In particular, our AdaMSGDA algorithm can use various adaptive learning rates to update the variables $x$ and $y$, without relying on any global or coordinate-wise adaptive learning rates. Theoretically, we present an effective convergence analysis framework for our methods. Specifically, we prove that our MSGDA and AdaMSGDA methods achieve the best-known sample (gradient) complexity of $O(\epsilon^{-3})$, requiring only one sample per iteration, for finding an $\epsilon$-stationary solution (i.e., $\mathbb{E}\|\nabla F(x)\|\leq \epsilon$, where $F(x)=\max_y f(x,y)$). This manuscript commemorates the mathematician Boris Polyak (1935-2023).
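For reference, a standard statement of the PL condition in $y$ invoked above (the modulus $\mu$ is our illustrative notation, not fixed by this abstract): for each fixed $x$, there exists $\mu > 0$ such that
$$
\|\nabla_y f(x,y)\|^2 \;\geq\; 2\mu \big( \max_{y'} f(x,y') \,-\, f(x,y) \big) \quad \text{for all } y .
$$
Under this condition the inner maximization admits linear-rate solvability even though $f(x,\cdot)$ need not be concave, which is what permits the single-loop, one-sample-per-iteration updates described above.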