Gender bias in word embeddings has gradually become an active research field in recent years. Most studies in this field focus on measurement and debiasing methods, with English as the target language. This paper investigates gender bias in static word embeddings from a unique perspective: Chinese adjectives. By training word representations with different models, we assess the gender bias encoded in the resulting adjective vectors. Through a comparison between the produced results and a human-scored dataset, we demonstrate how the gender bias encoded in word embeddings differs from people's attitudes.
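As an illustration of the kind of measurement described above, the sketch below scores an adjective by projecting its vector onto a gender direction built from male/female seed pairs, a common technique for static embeddings. This is a minimal sketch under stated assumptions, not necessarily the paper's exact metric: the seed pairs 他/她 and 男/女, the `gender_bias` helper, and the `vecs` dictionary of pretrained vectors are all illustrative assumptions.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_bias(word: str, vecs: dict,
                pairs=(("他", "她"), ("男", "女"))) -> float:
    """Score a word by its cosine similarity with a gender direction,
    the averaged difference of (male, female) seed-pair vectors.
    Positive values lean male, negative values lean female.
    Seed pairs here are illustrative assumptions, not the paper's."""
    direction = np.mean([vecs[m] - vecs[f] for m, f in pairs], axis=0)
    return cosine(vecs[word], direction)

# Hypothetical usage: `vecs` maps Chinese tokens to vectors from a
# trained static model (e.g., word2vec or GloVe).
# score = gender_bias("温柔", vecs)  # "gentle", an adjective
```

Scores produced this way could then be compared, adjective by adjective, against the human-scored dataset to see where the embedding's bias diverges from human judgments.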