A Taxonomy of Prompt Modifiers for Text-To-Image Generation

Text-to-image generation has seen an explosion of interest since 2021. Today, beautiful and intriguing digital images and artworks can be synthesized from textual inputs ("prompts") with deep generative models. Online communities around text-to-image generation and AI-generated art have quickly emerged. Based on a three-month ethnographic study, this paper identifies six types of prompt modifiers used by practitioners in the online community. This novel taxonomy of prompt modifiers gives researchers a conceptual starting point for investigating the practice of text-to-image generation, and may also help practitioners of AI-generated art improve their images. We further outline how prompt modifiers are applied in the practice of "prompt engineering." We discuss research opportunities that this novel creative practice opens up in the field of Human-Computer Interaction (HCI). The paper concludes with a discussion of the broader implications of prompt engineering from the perspective of Human-AI Interaction (HAI) in future applications beyond text-to-image generation and AI-generated art.
