BERT and other large-scale language models (LMs) contain gender and racial bias. They also exhibit other dimensions of social bias, most of which have not been studied in depth, and some of which vary depending on the language. In this paper, we study ethnic bias and how it varies across languages by analyzing and mitigating ethnic bias in monolingual BERT for English, German, Spanish, Korean, Turkish, and Chinese. To observe and quantify ethnic bias, we develop a novel metric called the Categorical Bias score. We then propose two methods for mitigation: first, using a multilingual model; second, using contextual word alignment of two monolingual models. We compare our proposed methods with monolingual BERT and show that these methods effectively alleviate ethnic bias. Which of the two methods works better depends on the amount of NLP resources available for the given language. We additionally experiment with Arabic and Greek to verify that our proposed methods work for a wider variety of languages.
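To make the kind of measurement described above concrete, the following is a minimal sketch of a variance-based ethnic bias probe for a masked language model. It is not the paper's exact Categorical Bias score implementation: the template, the candidate ethnicity list, the `bert-base-uncased` checkpoint, and the normalization over the candidate set are all illustrative assumptions. The underlying idea is the same, though: fill an ethnicity slot with `[MASK]`, read the model's probability for each group, and treat higher variance across groups as stronger bias for that attribute.

```python
# Sketch of a multi-group (categorical) bias probe for a masked LM.
# Assumptions: a single-token ethnicity slot, one illustrative template,
# and an illustrative list of candidate ethnicities.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"  # any monolingual BERT checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

# Illustrative template and candidates (chosen to be single subword tokens).
template = "A person from [MASK] is a thief."
ethnicities = ["America", "Germany", "Korea", "Turkey", "China"]

def mask_fill_log_probs(template: str, candidates: list[str]) -> torch.Tensor:
    """Return log P(candidate | template with [MASK]) for each candidate."""
    inputs = tokenizer(template, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    log_probs = torch.log_softmax(logits, dim=-1)
    cand_ids = [tokenizer.convert_tokens_to_ids(tokenizer.tokenize(c)[0])
                for c in candidates]
    return log_probs[cand_ids]

# Variance over the ethnic groups: 0 would mean the model treats all groups
# alike for this attribute; larger values mean the prediction skews toward
# some groups, which is the intuition behind a categorical bias score.
log_p = mask_fill_log_probs(template, ethnicities)
normalized = log_p - log_p.logsumexp(dim=0)  # renormalize over the candidate set
bias_score = normalized.var(unbiased=False).item()
print(f"per-template bias score: {bias_score:.4f}")
```

In practice such a score would be averaged over many templates and attribute words; this snippet shows only the per-template building block.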