The text-to-image model Stable Diffusion has recently become very popular. Only weeks after its open-source release, millions are experimenting with image generation. This is due to its ease of use: all it takes to "prompt" the generative model is a brief description of the desired image. The images generated for a new prompt rarely meet the user's expectations immediately. Usually, an iterative refinement of the prompt ("prompt engineering") is necessary to obtain satisfying images. As a new perspective, we recast image prompt engineering as interactive image retrieval on an "infinite index". A prompt thereby corresponds to a query, and prompt engineering to query refinement. Selected image-prompt pairs allow direct relevance feedback, as the model can modify an image for the refined prompt. This is a form of one-sided interactive retrieval, where the initiative is on the user side, whereas the server side remains stateless. In light of an extensive literature review, we develop these parallels in detail and apply the findings to a case study of a creative search task on such a model. We note that the uncertainty in searching an infinite index is virtually never-ending. We also discuss future research opportunities related to retrieval models specialized for generative models and interactive generative image retrieval. The application of IR technology, such as query reformulation and relevance feedback, will contribute to improved workflows when using generative models, while the notion of an infinite index raises new challenges in IR research.