Generative models (e.g., GANs, diffusion models) learn the underlying data distribution in an unsupervised manner. However, many applications of interest require sampling from a particular region of the output space or sampling evenly over a range of characteristics. For efficient sampling in these scenarios, we propose Generative Visual Prompt (PromptGen), a framework for distributional control over pre-trained generative models that incorporates knowledge from other off-the-shelf models. PromptGen defines control as energy-based models (EBMs) and samples images in a feed-forward manner by approximating the EBM with invertible neural networks, avoiding optimization at inference. Our experiments demonstrate how PromptGen can efficiently sample from several unconditional generative models (e.g., StyleGAN2, StyleNeRF, diffusion autoencoder, NVAE) in a controlled and/or de-biased manner using various off-the-shelf models: (1) with the CLIP model as control, PromptGen can sample images guided by text; (2) with image classifiers as control, PromptGen can de-bias generative models across a set of attributes or attribute combinations; and (3) with inverse graphics models as control, PromptGen can sample images of the same identity in different poses. (4) Finally, PromptGen reveals that the CLIP model exhibits a "reporting bias" when used as control, and PromptGen can further de-bias this controlled distribution in an iterative manner. The code is available at https://github.com/ChenWu98/Generative-Visual-Prompt.
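To make the mechanism described above concrete, the following is a minimal, hypothetical sketch (not the authors' released implementation) of the core idea: cast the control as an EBM over a frozen generator's latent space, p(z) ∝ N(z; 0, I) exp(−λ E(G(z))), and train an invertible network to approximate it by minimizing the reverse KL divergence, so that sampling at inference is a single feed-forward pass. Here `generator` and `energy` are assumed placeholders for a pre-trained model (e.g., StyleGAN2) and an off-the-shelf control (e.g., a CLIP-based score); the loss is correct up to additive constants.

```python
# Minimal sketch: approximate p_EBM(z) ~ N(z; 0, I) * exp(-lam * E(G(z)))
# with an invertible (normalizing-flow) network, trained by reverse KL.
# `generator` and `energy` are stand-ins, not part of the original abstract.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style affine coupling block (invertible by construction)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim // 2, 256), nn.ReLU(),
            nn.Linear(256, dim),  # predicts scale and shift for the second half
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                      # keep scales bounded for stability
        y2 = x2 * torch.exp(s) + t
        logdet = s.sum(dim=-1)                 # log|det Jacobian| of this block
        return torch.cat([x1, y2], dim=-1), logdet

class LatentFlow(nn.Module):
    """Stack of couplings mapping base noise eps -> latent z."""
    def __init__(self, dim, n_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(AffineCoupling(dim) for _ in range(n_blocks))

    def forward(self, eps):
        z = eps
        logdet = torch.zeros(eps.shape[0], device=eps.device)
        for block in self.blocks:
            z, ld = block(z)
            z = z.flip(-1)                     # mix halves between blocks
            logdet = logdet + ld
        return z, logdet

def reverse_kl_loss(flow, generator, energy, dim, lam=1.0, batch=8):
    """KL(q_theta || p_EBM) up to additive constants, where
    q_theta is the pushforward of N(0, I) through the flow and
    p_EBM(z) ~ N(z; 0, I) * exp(-lam * E(G(z)))."""
    eps = torch.randn(batch, dim)
    z, logdet = flow(eps)
    log_pz = -0.5 * (z ** 2).sum(dim=-1)       # standard-normal prior, up to const.
    return (-logdet - log_pz + lam * energy(generator(z))).mean()

# Inference after training: one feed-forward pass, no per-sample optimization:
#   eps = torch.randn(16, dim); z, _ = flow(eps); images = generator(z)
```

The key design point this sketch illustrates: because the flow is invertible, the density of its samples is available in closed form (via the log-determinant terms), which is what makes the reverse-KL objective tractable while keeping inference a single forward pass.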