Various existing studies have analyzed which social biases are inherited by NLP models. Because these biases may directly or indirectly harm people, previous studies have focused exclusively on human attributes; until recently, there was no research on social biases in NLP concerning nonhumans. In this paper, we analyze bias toward nonhuman animals, i.e., speciesist bias, inherent in English masked language models such as BERT. We analyzed speciesist bias against 46 animal names using template-based and corpus-extracted sentences containing speciesist (or non-speciesist) language. We found that pre-trained masked language models tend to associate harmful words with nonhuman animals and show a bias toward speciesist language for some nonhuman animal names. Our code for reproducing the experiments will be made available on GitHub.
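To illustrate the kind of probing the abstract describes, the following is a minimal sketch (not the authors' released code) of how a masked language model such as BERT can be queried with a template sentence to compare how strongly it associates candidate words with different animal names. The template, animal names, and candidate words here are hypothetical examples chosen for illustration, not the paper's actual stimuli.

```python
# Minimal sketch: scoring masked-token predictions with a pre-trained MLM
# to compare word-animal associations. Templates and word lists are
# illustrative assumptions, not the paper's materials.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"  # one of the MLMs the paper mentions (BERT)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

def masked_word_prob(template: str, animal: str, candidate: str) -> float:
    """Probability the MLM assigns to `candidate` at the mask position
    of a template instantiated with a given animal name."""
    sentence = template.format(animal=animal, mask=tokenizer.mask_token)
    inputs = tokenizer(sentence, return_tensors="pt")
    # Locate the mask position in the tokenized input.
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits[0, mask_index].softmax(dim=-1)
    candidate_id = tokenizer.convert_tokens_to_ids(candidate)
    return probs[0, candidate_id].item()

# Hypothetical template and word choices, for illustration only.
template = "The {animal} is {mask}."
for animal in ["pig", "dog"]:
    for word in ["filthy", "friendly"]:
        p = masked_word_prob(template, animal, word)
        print(f"P({word} | {animal}) = {p:.4f}")
```

Comparing such probabilities across animal names (or across speciesist versus non-speciesist wordings of the same sentence) is one straightforward way to quantify the associations the abstract reports.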