Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt. Finding and refining prompts that produce a desired image has become the art of prompt engineering. Generative models do not provide a built-in retrieval model for a user's information need expressed through prompts. In light of an extensive literature review, we reframe prompt engineering for generative models as interactive text-based retrieval on a novel kind of "infinite index". We apply these insights for the first time in a case study on image generation for game design with an expert. Finally, we envision how active learning may help to guide the retrieval of generated images.