Generative models (e.g., GANs and diffusion models) learn the underlying data distribution in an unsupervised manner. However, many applications of interest require sampling from a specific region of the generative model's output space or evenly over a range of characteristics. To allow efficient sampling in these scenarios, we propose Generative Visual Prompt (PromptGen), a framework for distributional control over pre-trained generative models by incorporating knowledge of arbitrary off-the-shelf models. PromptGen defines control as an energy-based model (EBM) and samples images in a feed-forward manner by approximating the EBM with invertible neural networks, avoiding optimization at inference. We demonstrate how PromptGen can control several generative models (e.g., StyleGAN2, StyleNeRF, diffusion autoencoder, and NVAE) using various off-the-shelf models: (1) with the CLIP model, PromptGen can sample images guided by text; (2) with image classifiers, PromptGen can de-bias generative models across a set of attributes; and (3) with inverse graphics models, PromptGen can sample images of the same identity in different poses. (4) Finally, PromptGen reveals that the CLIP model shows "reporting bias" when used as control, and PromptGen can further de-bias this controlled distribution in an iterative manner. Our code is available at https://github.com/ChenWu98/Generative-Visual-Prompt.
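To make the EBM-plus-invertible-network idea concrete, below is a minimal PyTorch sketch of this kind of training loop. It is not the paper's implementation: the generator `G`, the energy `energy_fn` (e.g., a classifier- or CLIP-based score), the latent dimension, and all hyperparameters are hypothetical placeholders, and the latent prior is assumed to be a standard normal (as in a typical GAN z-space). The sketch fits an invertible network q_theta(w) to the EBM p*(w) ∝ p(w) exp(−λ E(G(w))) by minimizing the reverse KL divergence, so that sampling afterwards is a single feed-forward pass with no optimization at inference.

```python
# Minimal sketch, assuming a differentiable pre-trained generator G(w) -> images
# and an off-the-shelf energy energy_fn(images) -> per-sample scalar energies.
# Both are hypothetical placeholders, not the paper's actual models.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer with a tractable log-determinant.
    Assumes an even latent dimension."""
    def __init__(self, dim, hidden=256, flip=False):
        super().__init__()
        self.flip = flip
        half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * half),
        )

    def forward(self, x):
        xa, xb = x.chunk(2, dim=-1)       # xa conditions, xb is transformed
        if self.flip:
            xa, xb = xb, xa
        s, t = self.net(xa).chunk(2, dim=-1)
        s = torch.tanh(s)                  # keep scales well-behaved
        yb = xb * torch.exp(s) + t
        out = torch.cat([yb, xa] if self.flip else [xa, yb], dim=-1)
        return out, s.sum(dim=-1)          # log|det J| of this layer

class Flow(nn.Module):
    """Invertible network eps -> w, built from alternating coupling layers."""
    def __init__(self, dim, n_layers=6):
        super().__init__()
        self.layers = nn.ModuleList(
            [AffineCoupling(dim, flip=(i % 2 == 1)) for i in range(n_layers)])

    def forward(self, eps):
        w, logdet = eps, torch.zeros(eps.shape[0], device=eps.device)
        for layer in self.layers:
            w, ld = layer(w)
            logdet = logdet + ld
        return w, logdet

def reverse_kl_loss(flow, G, energy_fn, prior, batch=64, lam=1.0):
    """E_q[log q(w) - log p(w) + lam * E(G(w))], dropping the constant log Z."""
    eps = prior.sample((batch,))
    w, logdet = flow(eps)
    log_q = prior.log_prob(eps).sum(-1) - logdet   # change of variables
    log_p = prior.log_prob(w).sum(-1)              # assumed latent prior of G
    return (log_q - log_p + lam * energy_fn(G(w))).mean()

# Hypothetical usage, with G and energy_fn defined elsewhere:
# dim = 512
# flow = Flow(dim)
# prior = torch.distributions.Normal(torch.zeros(dim), torch.ones(dim))
# opt = torch.optim.Adam(flow.parameters(), lr=1e-4)
# for step in range(10_000):
#     opt.zero_grad()
#     reverse_kl_loss(flow, G, energy_fn, prior).backward()
#     opt.step()
# After training, sampling is feed-forward: w, _ = flow(prior.sample((n,)))
```

A convenient property of this objective is that the reverse KL only requires samples from the flow itself, never from the intractable target EBM, which is what makes a purely feed-forward approximation practical.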