Social media has amplified the promotion of Western beauty norms, contributing to negative self-image, particularly among women and girls, and to harms such as body dysmorphia. Increasingly, internet content is artificially generated, raising concerns that these norms are being exaggerated. The aim of this work is to study how generative AI models may encode 'beauty' and erase 'ugliness', and to discuss the implications for society. To investigate these aims, we create two image generation pipelines: a text-to-image model and a text-to-language-model-to-image model. We develop a structured beauty taxonomy, which we use to prompt three language models (LMs) and two text-to-image models, cumulatively generating 5984 images across our two pipelines. We then recruit women and non-binary social media users to evaluate 1200 of the images in a Likert-scale within-subjects study. Participants show high agreement in their ratings. Our results show that 86.5% of generated images depicted people with lighter skin tones, 22% contained explicit content despite Safe for Work (SFW) training, and 74% were rated as belonging to a younger age demographic. In particular, images of non-binary individuals were rated as both younger and more hypersexualised, indicating troubling intersectional effects. Notably, prompts encoding 'negative' or 'ugly' beauty traits (such as "a wide nose") consistently produced higher Not Safe for Work (NSFW) ratings regardless of gender. This work sheds light on the pervasive demographic biases related to beauty standards in generative AI models, biases that are actively perpetuated by model developers, for example via negative prompting. We conclude by discussing the implications for society, which include pollution of data streams and the active erasure of features that fall outside what developers consider beautiful.