In creativity support and computational co-creativity contexts, discovering appropriate prompts for text-to-image generative models remains difficult. In many cases the creator wishes to evoke a certain impression with the image, but conveying that impression succinctly in a text prompt poses a challenge: affective language is nuanced, complex, and model-specific. In this work we introduce a method for generating images conditioned on desired affect, quantified using a psychometrically validated three-component approach, which can be combined with conditioning on text descriptions. We first train a neural network to estimate the affect content of text and images from semantic embeddings, and then demonstrate how this estimator can be used to exert control over a variety of generative models. We show examples of how affect modifies the outputs, provide quantitative and qualitative analysis of its capabilities, and discuss possible extensions and use cases.
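To make the described pipeline concrete, the sketch below shows one plausible shape for an affect estimator: a small MLP mapping semantic embeddings (e.g. CLIP) to three affect scores, whose gradient with respect to an embedding could then steer a generative model toward a target affect. The embedding dimension, layer sizes, and the valence-arousal-dominance target values are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch (assumptions, not the paper's released code): an MLP that maps
# a semantic embedding to three affect components, used as a differentiable critic.
import torch
import torch.nn as nn

class AffectEstimator(nn.Module):
    def __init__(self, embed_dim: int = 512, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),  # three affect components (e.g. valence, arousal, dominance)
        )

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.net(embedding)

# Example guidance step: nudge a candidate embedding toward a desired affect.
estimator = AffectEstimator()
image_embedding = torch.randn(1, 512, requires_grad=True)  # stand-in for a CLIP image embedding
target_affect = torch.tensor([[0.8, 0.3, 0.5]])            # hypothetical desired affect vector

loss = nn.functional.mse_loss(estimator(image_embedding), target_affect)
loss.backward()  # gradient w.r.t. the embedding can be fed back to steer a generator
```

In a guidance setting, this loss gradient would typically be combined with the usual text-conditioning signal, so that affect acts as an additional, continuously adjustable control rather than replacing the prompt.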