Text-to-image generation methods produce high-resolution, high-quality images, but they should not produce immoral images containing content that is inappropriate from a commonsense-morality perspective. Conventional approaches often neglect these ethical concerns, and existing solutions offer only limited means of avoiding immoral image generation. In this paper, we aim to automatically judge the immorality of synthesized images and manipulate them into moral alternatives. To this end, we build a model with three main primitives: (1) it recognizes the visual commonsense immorality of a given image, (2) it localizes or highlights the immoral visual (and textual) attributes that make the image immoral, and (3) it manipulates a given immoral image into a morally acceptable alternative. We experiment with the state-of-the-art Stable Diffusion text-to-image generation model and show the effectiveness of our ethical image manipulation. Our human study confirms that our model is indeed able to generate morally satisfying images from immoral ones. Our implementation will be made publicly available upon publication so that it can be widely used as a new safety checker for text-to-image generation models.