Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice for many applications (e.g., GAN training), the majority of existing theoretical analyses focus on simultaneous algorithms for ease of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart (Sim-GDA) in many settings. In particular, we prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems, whereas Sim-GDA converges at a much slower rate. To our knowledge, this is the \emph{first} result in any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant factor. We further prove that the acceleration effect of alternating updates persists when the minimax problem is only strongly concave in the dual variables. Lastly, we adapt the theory of integral quadratic constraints and show that Alt-GDA attains the same rate \emph{globally} for a class of SCSC minimax problems. Numerical experiments on quadratic minimax games validate our claims. Empirically, we demonstrate that alternating updates speed up GAN training significantly and that optimism helps only for simultaneous algorithms.
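To make the comparison concrete, the two algorithms differ only in which iterate the dual player's gradient is evaluated at. For $\min_x \max_y f(x,y)$ with a common step size $\eta$ (the paper's analysis may allow separate step sizes for the two players), the updates are
\begin{align*}
\text{Sim-GDA:}\quad x_{t+1} &= x_t - \eta \nabla_x f(x_t, y_t), & y_{t+1} &= y_t + \eta \nabla_y f(x_t, y_t),\\
\text{Alt-GDA:}\quad x_{t+1} &= x_t - \eta \nabla_x f(x_t, y_t), & y_{t+1} &= y_t + \eta \nabla_y f(x_{t+1}, y_t).
\end{align*}
The following is a minimal numerical sketch of this contrast on a two-dimensional SCSC quadratic game. It is our own illustration, not the paper's experimental code; the curvature $\mu$, coupling $b$, and step size $\eta$ below are hypothetical choices.
\begin{verbatim}
# Sim-GDA vs. Alt-GDA on f(x, y) = (mu/2)*x^2 + b*x*y - (mu/2)*y^2,
# a strongly convex-strongly concave quadratic game whose unique
# saddle point is the origin.
mu, b, eta = 1.0, 4.0, 0.1  # hypothetical curvature, coupling, step size

def grads(x, y):
    """Return (df/dx, df/dy) at the point (x, y)."""
    return mu * x + b * y, b * x - mu * y

def sim_gda(x, y):
    gx, gy = grads(x, y)               # both gradients use the old iterate
    return x - eta * gx, y + eta * gy

def alt_gda(x, y):
    gx, _ = grads(x, y)
    x_new = x - eta * gx               # descent step for the primal player
    _, gy = grads(x_new, y)            # ascent step sees the updated x
    return x_new, y + eta * gy

sim, alt = (1.0, 1.0), (1.0, 1.0)
for _ in range(200):
    sim, alt = sim_gda(*sim), alt_gda(*alt)

# Distance to the saddle point; Alt-GDA contracts markedly faster here.
print("Sim-GDA:", (sim[0]**2 + sim[1]**2) ** 0.5)
print("Alt-GDA:", (alt[0]**2 + alt[1]**2) ** 0.5)
\end{verbatim}
With these parameters the spectral radius of the Sim-GDA iteration matrix is about $0.985$ while that of Alt-GDA is $0.9$, so after 200 iterations the gap between the two distances spans several orders of magnitude, consistent with the rate separation claimed above.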