Many modern machine learning algorithms, such as generative adversarial networks (GANs) and adversarial training, can be formulated as minimax optimization. Gradient descent ascent (GDA) is the most commonly used algorithm due to its simplicity. However, GDA can converge to non-optimal minimax points. We propose a new minimax optimization framework, GDA-AM, that views the GDA dynamics as a fixed-point iteration and solves it using Anderson Mixing to converge to the local minimax. It addresses the diverging issue of simultaneous GDA and accelerates the convergence of alternating GDA. We show theoretically that the algorithm can achieve global convergence for bilinear problems under mild conditions. We also show empirically that GDA-AM solves a variety of minimax problems and improves GAN training on several datasets.
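To make the fixed-point view concrete, the sketch below applies a standard windowed Anderson Mixing update to the simultaneous GDA map on a bilinear problem f(x, y) = xᵀAy. It is a minimal illustration under assumed details, not the paper's GDA-AM implementation; the function names (gda_step, anderson_gda), the window size, and the step size are all illustrative choices.

```python
# Minimal sketch (illustrative, not the authors' GDA-AM code): simultaneous GDA
# on the bilinear problem f(x, y) = x^T A y, viewed as a fixed-point map
# z -> g(z) and accelerated with a windowed Anderson Mixing step.
import numpy as np

def gda_step(z, A, lr=0.1):
    """One simultaneous GDA step: x descends f, y ascends f."""
    n = A.shape[0]
    x, y = z[:n], z[n:]
    gx = A @ y           # grad_x f = A y
    gy = A.T @ x         # grad_y f = A^T x
    return np.concatenate([x - lr * gx, y + lr * gy])

def anderson_gda(z0, A, iters=200, m=5, lr=0.1):
    """Fixed-point iteration z_{k+1} = g(z_k) with window-m Anderson Mixing."""
    zs, gs = [z0], [gda_step(z0, A, lr)]
    z = gs[-1]
    for _ in range(iters):
        zs.append(z)
        gs.append(gda_step(z, A, lr))
        # Residuals r_i = g(z_i) - z_i over the last m iterates.
        R = np.stack([g - s for g, s in zip(gs[-m:], zs[-m:])], axis=1)
        if R.shape[1] > 1:
            # Solve min_alpha ||R alpha|| s.t. sum(alpha) = 1 via the standard
            # unconstrained reformulation on residual differences.
            dR = R[:, 1:] - R[:, :-1]
            gamma, *_ = np.linalg.lstsq(dR, R[:, -1], rcond=None)
            alpha = np.zeros(R.shape[1])
            alpha[-1] = 1.0
            alpha[:-1] += gamma
            alpha[1:] -= gamma
            # Mixed iterate: combination of the last m fixed-point images.
            z = np.stack(gs[-m:], axis=1) @ alpha
        else:
            z = gs[-1]
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
z_star = anderson_gda(rng.standard_normal(6), A)
print(np.linalg.norm(z_star))  # should approach 0, the bilinear saddle point
```

On this bilinear example, plain simultaneous GDA spirals outward, whereas the Anderson-mixed iteration drives the combined residual toward zero and converges to the saddle point, which is the behavior the abstract attributes to GDA-AM.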