Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models

Text-to-image generative models have demonstrated remarkable capabilities in generating high-quality images from textual prompts. However, crafting prompts that accurately capture the user's creative intent remains challenging. It often takes laborious trial and error to ensure that the model interprets a prompt as the user intends. To address these challenges, we present Promptify, an interactive system that supports prompt exploration and refinement for text-to-image generative models. Promptify uses a suggestion engine powered by large language models to help users quickly explore and craft diverse prompts. Our interface lets users organize the generated images flexibly, and Promptify suggests potential changes to the original prompt based on their preferences. This feedback loop enables users to iteratively refine their prompts, enhancing desired features while avoiding unwanted ones. Our user study shows that Promptify effectively facilitates the text-to-image workflow and outperforms an existing baseline tool widely used for text-to-image generation.
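The feedback loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how Promptify's suggestion engine is implemented, so everything below is an assumption made for illustration: the `Feedback` type, the instruction format, and the `refine` function are hypothetical, and `toy_suggest` is a rule-based stand-in where a real system would call a large language model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Feedback:
    """Hypothetical summary of one round of user preferences."""
    liked: List[str] = field(default_factory=list)     # features to enhance
    disliked: List[str] = field(default_factory=list)  # features to avoid


def build_instruction(prompt: str, feedback: Feedback) -> str:
    """Compose the text handed to the suggestion engine (assumed format)."""
    return (
        f"Original prompt: {prompt}\n"
        f"Emphasize: {', '.join(feedback.liked) or 'nothing'}\n"
        f"Avoid: {', '.join(feedback.disliked) or 'nothing'}\n"
        "Suggest a revised prompt."
    )


def refine(prompt: str,
           rounds: List[Feedback],
           suggest: Callable[[str], str]) -> List[str]:
    """Apply one suggestion per feedback round and return every prompt tried."""
    history = [prompt]
    for feedback in rounds:
        prompt = suggest(build_instruction(prompt, feedback)).strip()
        history.append(prompt)
    return history


def toy_suggest(instruction: str) -> str:
    """Stand-in for an LLM call: drops 'Avoid' terms, appends 'Emphasize' terms."""
    fields = dict(
        line.split(": ", 1) for line in instruction.splitlines() if ": " in line
    )
    parts = [p.strip() for p in fields["Original prompt"].split(",")]
    avoid = {t.strip() for t in fields["Avoid"].split(",")}
    add = [t.strip() for t in fields["Emphasize"].split(",")
           if t.strip() != "nothing"]
    kept = [p for p in parts if p not in avoid]
    return ", ".join(kept + [t for t in add if t not in kept])


if __name__ == "__main__":
    rounds = [
        Feedback(liked=["watercolor"], disliked=["dark"]),
        Feedback(liked=["sunrise"]),
    ]
    for step, text in enumerate(refine("a castle, dark", rounds, toy_suggest)):
        print(step, text)
```

Passing the suggestion engine in as a callable keeps the loop independent of any particular model or API; swapping `toy_suggest` for a function that queries a language model would leave `refine` unchanged.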
