Text-guided synthesis of images has made a giant leap towards becoming a mainstream phenomenon. With text-to-image generation systems, anybody can create digital images and artworks. This raises the question of whether text-to-image generation is creative. This paper expounds on the nature of human creativity involved in text-to-image art (so-called "AI art"), with a specific focus on the practice of prompt engineering. The paper argues that the current product-centered view of creativity falls short in the context of text-to-image generation. A case exemplifying this shortcoming is provided, and the importance of online communities for the creative ecosystem of text-to-image art is highlighted. The paper provides a high-level summary of this online ecosystem, drawing on Rhodes' conceptual four P model of creativity. Challenges for evaluating the creativity of text-to-image generation and opportunities for research on text-to-image generation in the field of Human-Computer Interaction (HCI) are discussed.