With rapidly growing model complexity and data volume, training deep generative models (DGMs) for better performance has become an increasingly important challenge. Previous research on this problem has mainly focused on improving DGMs by either introducing new objective functions or designing more expressive model architectures. However, such approaches often incur significant additional computational and/or design overhead. To address these issues, we introduce in this paper a generic framework called {\em generative-model inference} that can enhance pre-trained GANs effectively and seamlessly in a variety of application scenarios. Our basic idea is to efficiently infer the optimal latent distribution for given requirements using Wasserstein gradient flow techniques, instead of re-training or fine-tuning the pre-trained model parameters. Extensive experimental results on applications such as image generation, image translation, text-to-image generation, image inpainting, and text-guided image editing demonstrate the effectiveness and superiority of the proposed framework.
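The abstract gives no formulas; as a rough illustration of the underlying idea, one standard particle discretization of a Wasserstein gradient flow is Langevin dynamics over latent codes. In the sketch below, $G$ is the frozen pre-trained generator, $E$ a requirement-specific energy over its outputs, and $\eta$ a step size; this notation is ours for illustration, not the paper's:
\[
z_{t+1} \;=\; z_t \;-\; \eta\, \nabla_z E\big(G(z_t)\big) \;+\; \sqrt{2\eta}\,\xi_t, \qquad \xi_t \sim \mathcal{N}(0, I).
\]
In the continuum limit, these updates simulate the Wasserstein-2 gradient flow of the free energy $\mathbb{E}_{\rho}\!\left[E(G(z))\right] + \mathbb{E}_{\rho}\!\left[\log \rho(z)\right]$, so it is the latent distribution $\rho$, rather than the generator's weights, that gets optimized.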