Advances in multimodal AI have presented people with powerful ways to create images from text. Recent work has shown that text-to-image generations are able to represent a broad range of subjects and artistic styles. However, finding the right visual language for text prompts is difficult. In this paper, we address this challenge with Opal, a system that produces text-to-image generations for news illustration. Given an article, Opal guides users through a structured search for visual concepts and provides a pipeline that lets users generate illustrations based on the article's tone, keywords, and related artistic styles. Our evaluation shows that Opal efficiently generates diverse sets of news illustrations, visual assets, and concept ideas. Users with Opal generated two times more usable results than users without it. We discuss how structured exploration can help users better understand the capabilities of human-AI co-creative systems.