In this paper, we propose a new approach to training Generative Adversarial Networks (GANs) that deploys a double-oracle framework with generator and discriminator oracles. A GAN is essentially a two-player zero-sum game between the generator and the discriminator. Training GANs is challenging because a pure Nash equilibrium may not exist, and even finding a mixed Nash equilibrium is difficult since GANs have a large-scale strategy space. In DO-GAN, we extend the double-oracle framework to GANs. We first generalize the players' strategies as the trained generator and discriminator models obtained from the best-response oracles. We then compute the meta-strategies by solving a linear program. To keep the framework scalable as multiple generator and discriminator best responses accumulate in memory, we propose two solutions: (1) pruning weakly-dominated players' strategies to keep the oracles from becoming intractable; (2) applying continual learning to retain the knowledge of the previous networks. We apply our framework to established GAN architectures such as the vanilla GAN, Deep Convolutional GAN, Spectral Normalization GAN and Stacked GAN. Finally, we conduct experiments on the MNIST, CIFAR-10 and CelebA datasets and show that DO-GAN variants achieve significant improvements in both subjective qualitative evaluation and quantitative metrics over their respective GAN architectures.
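The meta-strategy step the abstract mentions is the standard linear program for a two-player zero-sum matrix game. The sketch below is not the authors' implementation; it assumes a meta-game payoff matrix U where U[i][j] is the utility of the i-th stored generator against the j-th stored discriminator, and the helper name solve_meta_strategy is hypothetical.

```python
# Minimal sketch of solving the meta-game LP with SciPy.
# Assumption: U[i][j] is the generator's payoff when generator i
# plays against discriminator j (zero-sum, so the discriminator
# receives -U[i][j]).
import numpy as np
from scipy.optimize import linprog

def solve_meta_strategy(U: np.ndarray):
    """Return (mixed meta-strategy over generators, game value)."""
    m, n = U.shape
    # Variables: x_1..x_m (strategy weights) and v (game value).
    # linprog minimizes, so minimize -v to maximize v.
    c = np.concatenate([np.zeros(m), [-1.0]])
    # For every discriminator column j:  v - sum_i x_i * U[i, j] <= 0,
    # i.e. the mixed strategy guarantees at least v against every column.
    A_ub = np.hstack([-U.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Strategy weights form a probability distribution.
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    b_eq = np.array([1.0])
    # x_i >= 0; the game value v is unbounded.
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds)
    return res.x[:m], res.x[-1]

if __name__ == "__main__":
    # Toy 2x2 meta-game (matching pennies): the mixed equilibrium
    # is (0.5, 0.5) with game value 0.
    U = np.array([[1.0, -1.0], [-1.0, 1.0]])
    sigma, value = solve_meta_strategy(U)
    print(sigma, value)
```

The discriminator's meta-strategy follows symmetrically by solving the same LP on -U.T.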