Generative Adversarial Networks are notoriously challenging to train. The underlying minmax optimization is highly susceptible to the variance of the stochastic gradient and to the rotational component of the associated game vector field. To tackle these challenges, we propose a Lookahead algorithm for minmax optimization, extending a method originally developed for single-objective minimization only. The backtracking step of our Lookahead-minmax naturally handles the rotational game dynamics, a property identified as key for enabling gradient ascent-descent methods to converge on challenging examples often analyzed in the literature. Moreover, it implicitly handles high variance without relying on large mini-batches, which are known to be essential for reaching state-of-the-art performance. Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient, in terms of performance and improved stability, at negligible memory and computational cost. Using 30-fold fewer parameters and 16-fold smaller mini-batches, we outperform the reported performance of the class-dependent BigGAN on CIFAR-10 by obtaining an FID of 12.19 without using the class labels, bringing state-of-the-art GAN training within reach of common computational resources.
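To make the backtracking step concrete, the following is a minimal sketch of one Lookahead-minmax cycle in PyTorch. It assumes a user-supplied `inner_step` callback that performs one base update on both players (e.g., a simultaneous Adam or extragradient step); the names `inner_step`, `k`, and `alpha` follow the usual Lookahead notation, but the interface is illustrative rather than the paper's reference implementation.

```python
import copy
import torch

def lookahead_minmax_cycle(generator, discriminator, inner_step, k=5, alpha=0.5):
    """One Lookahead-minmax cycle (illustrative sketch, not the official code):
    run k fast steps of a base optimizer on both players jointly, then
    backtrack each player's weights toward the slow snapshot taken beforehand.
    """
    # Snapshot the slow weights of both players before the fast phase.
    slow_g = copy.deepcopy(generator.state_dict())
    slow_d = copy.deepcopy(discriminator.state_dict())

    # Fast phase: k joint updates with the base optimizer
    # (e.g., simultaneous Adam or one extragradient step per call).
    for _ in range(k):
        inner_step(generator, discriminator)

    # Backtracking step: interpolate the fast weights toward the slow ones,
    #   omega <- omega_slow + alpha * (omega_fast - omega_slow).
    with torch.no_grad():
        for name, p in generator.named_parameters():
            p.copy_(slow_g[name] + alpha * (p - slow_g[name]))
        for name, p in discriminator.named_parameters():
            p.copy_(slow_d[name] + alpha * (p - slow_d[name]))
```

In this sketch the averaging toward the slow weights is what damps both the rotational component of the game dynamics and the gradient noise accumulated over the k fast steps, which is why no large mini-batches are needed in the inner updates.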