We propose a variant of the standard min-max framework for GANs to learn a distribution, in which the discriminator may update its strategy greedily until it reaches a first-order stationary point. We give an algorithm to train such a GAN and show that it provably converges from any initial point to an approximate local equilibrium for this framework. Our algorithm runs in time polynomial in the smoothness parameters of the loss function, independent of the dimension, and allows loss functions that are nonconvex and nonconcave in the parameters of the generator and discriminator. Empirically, GANs trained with our algorithm consistently learn more modes than gradient descent-ascent (GDA), optimistic mirror descent (OMD), and unrolled GANs on a synthetic Gaussian mixture dataset. Moreover, they perform significantly better on CIFAR-10 than OMD and GDA when compared on the mean and standard deviation of the Inception Score, respectively.
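To make the training scheme concrete, the following is a minimal sketch of the alternating update described above: the discriminator ascends its objective greedily until its gradient is small (an approximate first-order stationary point), after which the generator takes a single descent step. The toy loss, the gradient functions, and all hyperparameters (`eta_g`, `eta_d`, `eps`, `max_inner`) are illustrative stand-ins for exposition, not the paper's construction or guarantees.

```python
import numpy as np

# Toy nonconvex-nonconcave objective f(g, d): the generator parameter g
# minimizes f, the discriminator parameter d maximizes f. This loss is
# a hypothetical example chosen only so the gradients are simple.
def loss(g, d):
    return g * d - 0.1 * g**3 + 0.1 * np.sin(d)

def grad_g(g, d):  # partial derivative of f with respect to g
    return d - 0.3 * g**2

def grad_d(g, d):  # partial derivative of f with respect to d
    return g + 0.1 * np.cos(d)

def train(g, d, outer_steps=200, eta_g=0.05, eta_d=0.1, eps=1e-3, max_inner=1000):
    for _ in range(outer_steps):
        # Greedy discriminator phase: gradient ascent on f until the
        # gradient magnitude falls below eps, i.e. the discriminator
        # reaches an approximate first-order stationary point.
        for _ in range(max_inner):
            gd = grad_d(g, d)
            if abs(gd) <= eps:
                break
            d += eta_d * gd
        # One generator descent step against the locally converged discriminator.
        g -= eta_g * grad_g(g, d)
    return g, d

g, d = train(g=1.0, d=-1.0)
print(f"g = {g:.4f}, d = {d:.4f}")
```

In contrast to GDA, which interleaves single simultaneous (or alternating) gradient steps for both players, the inner loop here runs the discriminator to approximate stationarity before each generator update.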