There is much recent interest in solving nonconvex min-max optimization problems due to their broad applications in many areas, including machine learning, networked resource allocation, and distributed optimization. Perhaps the most popular first-order method for solving min-max optimization is the so-called simultaneous (or single-loop) gradient descent-ascent algorithm, owing to its simplicity of implementation. However, theoretical guarantees on the convergence of this algorithm are sparse, since it can diverge even on a simple bilinear problem. In this paper, our focus is to characterize the finite-time performance (or convergence rates) of the continuous-time variant of the simultaneous gradient descent-ascent algorithm. In particular, we derive the rates of convergence of this method under a number of different conditions on the underlying objective function, namely, two-sided Polyak-Łojasiewicz (PL), one-sided PL, nonconvex-strongly-concave, and strongly-convex-nonconcave conditions. Our convergence results improve upon those in prior works under the same conditions on the objective function. The key idea in our analysis is to use classic singular perturbation theory and coupled Lyapunov functions to address the time-scale difference and interactions between the gradient descent and ascent dynamics. Our results on the behavior of the continuous-time algorithm may be used to enhance the convergence properties of its discrete-time counterpart.
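The divergence mentioned above can be seen in a minimal sketch (the bilinear objective f(x, y) = x*y, the step size, and the iteration count are illustrative choices, not taken from the paper): under the simultaneous update, the squared iterate norm is multiplied by (1 + eta**2) at every step, so the iterates spiral outward rather than converge to the saddle point at the origin.

```python
def simultaneous_gda(x, y, eta, steps):
    """Simultaneous (single-loop) gradient descent-ascent on f(x, y) = x*y.

    grad_x f = y and grad_y f = x, so both variables are updated
    from the *same* current iterate (descent in x, ascent in y).
    """
    for _ in range(steps):
        x, y = x - eta * y, y + eta * x
    return x, y


x0, y0 = 1.0, 1.0
xT, yT = simultaneous_gda(x0, y0, eta=0.1, steps=500)

# Each step scales x**2 + y**2 by exactly (1 + eta**2), so after
# 500 steps the squared norm has grown by a factor (1.01)**500.
print(xT**2 + yT**2)  # far larger than the initial value 2.0
```

This is the simple bilinear counterexample behind the sparse convergence guarantees for the discrete-time method; the conditions studied in the paper (two-sided PL, one-sided PL, etc.) rule out such behavior for the continuous-time dynamics.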