Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, the majority of existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart~(Sim-GDA) in many settings. We prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems, while Sim-GDA converges at a much slower rate. To our knowledge, this is the \emph{first} result in any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant factor. We further adapt the theory of integral quadratic constraints (IQC) and show that Alt-GDA attains the same rate \emph{globally} for a subclass of SCSC minimax problems. Empirically, we demonstrate that alternating updates significantly speed up GAN training, and that optimism helps only for simultaneous algorithms.
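The distinction between the two update schemes can be illustrated on a toy SCSC problem. The sketch below (not the paper's experimental setup; the quadratic $f(x,y) = x^2/2 + xy - y^2/2$ and step size are illustrative choices) contrasts Sim-GDA, where both players step from the current iterate, with Alt-GDA, where the ascent step uses the freshly updated descent variable:

```python
# Toy comparison of simultaneous vs. alternating GDA on the SCSC
# quadratic f(x, y) = x**2/2 + x*y - y**2/2 with equilibrium (0, 0).
# Illustrative sketch only, not the paper's experiments.

def grad_x(x, y):
    return x + y  # df/dx

def grad_y(x, y):
    return x - y  # df/dy

def sim_gda(x, y, eta, steps):
    # Simultaneous updates: both players use the *current* iterate.
    for _ in range(steps):
        x, y = x - eta * grad_x(x, y), y + eta * grad_y(x, y)
    return x, y

def alt_gda(x, y, eta, steps):
    # Alternating updates: y's ascent step sees the *already updated* x.
    for _ in range(steps):
        x = x - eta * grad_x(x, y)
        y = y + eta * grad_y(x, y)
    return x, y

eta, steps = 0.5, 30
xs, ys = sim_gda(1.0, 1.0, eta, steps)
xa, ya = alt_gda(1.0, 1.0, eta, steps)
sim_dist = (xs**2 + ys**2) ** 0.5
alt_dist = (xa**2 + ya**2) ** 0.5
print(f"Sim-GDA distance: {sim_dist:.2e}, Alt-GDA distance: {alt_dist:.2e}")
```

On this example both methods converge, but the spectral radius of the Alt-GDA update map is strictly smaller, so its iterates reach the equilibrium far faster, consistent with the separation described above.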