Text-guided image generation models can be prompted to generate images using nonce words adversarially designed to robustly evoke specific visual concepts. Two approaches for such generation are introduced: macaronic prompting, which involves designing cryptic hybrid words by concatenating subword units from different languages; and evocative prompting, which involves designing nonce words whose broad morphological features are similar enough to those of existing words to trigger robust visual associations. The two methods can also be combined to generate images associated with more specific visual concepts. The implications of these techniques for circumventing existing approaches to content moderation, and in particular for generating offensive or harmful images, are discussed.