Although large pre-trained language models have achieved great success in many NLP tasks, it has been shown that they reflect human biases from their pre-training corpora. These biases may lead to undesirable outcomes when such models are deployed in real-world settings. In this paper, we investigate the bias present in monolingual BERT models across a diverse set of languages (English, Greek, and Persian). While recent research has mostly focused on gender-related biases, we analyze religious and ethnic biases as well, and propose a template-based method based on sentence pseudo-likelihood that can measure any kind of bias and handle morphologically complex languages with gender-based adjective declensions. We analyze each monolingual model with this method and visualize cultural similarities and differences across dimensions of bias. Ultimately, we conclude that current methods of probing for bias are highly language-dependent, necessitating cultural insights regarding the unique ways bias is expressed in each language and culture (e.g., through coded language, synecdoche, and other similar linguistic concepts). We also hypothesize that the higher social biases measured in the non-English BERT models correlate with the presence of user-generated content in their pre-training corpora.
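To make the notion of sentence pseudo-likelihood concrete, the sketch below scores a sentence under a masked language model by masking one token at a time and summing the log-probabilities of the original tokens. This is a minimal, generic illustration rather than the paper's exact template-scoring procedure; the HuggingFace `transformers` library, the `bert-base-uncased` checkpoint, and the example template pair are assumptions made for illustration only.

```python
# Minimal sketch of masked-LM pseudo-log-likelihood scoring (assumptions noted above).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"  # assumption: any monolingual BERT checkpoint could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum log P(token_i | rest of sentence), masking one token at a time."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    with torch.no_grad():
        for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
            masked = ids.clone()
            masked[i] = tokenizer.mask_token_id
            logits = model(masked.unsqueeze(0)).logits[0, i]
            log_probs = torch.log_softmax(logits, dim=-1)
            total += log_probs[ids[i]].item()
    return total

# Hypothetical template pair: the same template filled with different group terms.
# A large gap in pseudo-log-likelihood suggests a biased association.
print(pseudo_log_likelihood("The engineer said he was tired."))
print(pseudo_log_likelihood("The engineer said she was tired."))
```

Because the score is computed token by token, it extends naturally to morphologically rich languages where a single group term changes the declension of surrounding adjectives, which is the property the proposed method relies on.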