Advances in multimodal AI have given people powerful ways to create images from text. Recent work has shown that text-to-image generations can represent a broad range of subjects and artistic styles. However, translating text prompts into visual messages is difficult. In this paper, we address this challenge with Opal, a system that produces text-to-image generations for editorial illustration. Given an article text, Opal guides users through a structured search for visual concepts and provides pipelines that allow users to illustrate based on an article's tone, subjects, and intended illustration style. Our evaluation shows that Opal efficiently generates diverse sets of editorial illustrations, graphic assets, and concept ideas. Users with Opal were more efficient at generation and generated more than twice as many usable results as users without it. We conclude with a discussion of how structured and rapid exploration can help users better understand the capabilities of human-AI co-creative systems.