Starting from the premise that implicit human biases are reflected in the statistical regularities of language, it is possible to measure biases in English static word embeddings. State-of-the-art neural language models generate dynamic word embeddings that depend on the context in which a word appears. Current methods measure pre-defined social and intersectional biases that appear in particular contexts defined by sentence templates. Dispensing with templates, we introduce the Contextualized Embedding Association Test (CEAT), which summarizes the magnitude of overall bias in neural language models by incorporating a random-effects model. Experiments on social and intersectional biases show that CEAT finds evidence of all tested biases and provides comprehensive information on the variance of effect magnitudes of the same bias in different contexts. All the models we study that were trained on English corpora contain biased representations. Furthermore, we develop two methods, Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD), to automatically identify intersectional biases and emergent intersectional biases from static word embeddings, in addition to measuring them in contextualized word embeddings. We present the first algorithmic bias detection findings on how intersectional group members are strongly associated with unique emergent biases that do not overlap with the biases of their constituent minority identities. IBD and EIBD achieve high accuracy when detecting the intersectional and emergent biases of African American females and Mexican American females. Our results indicate that biases at the intersection of race and gender associated with members of multiple minority groups, such as African American females and Mexican American females, have the highest magnitude across all neural language models.
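To make the random-effects step concrete, the sketch below shows one common way to pool per-context effect sizes into a single summary. It is an illustration only: the abstract does not state which estimator CEAT uses, so the DerSimonian-Laird estimator is an assumption here, and the function name `combined_effect_size` and its inputs (one effect size and one sampling variance per context) are hypothetical.

```python
import math


def combined_effect_size(effects, variances):
    """Pool per-context effect sizes with a random-effects model.

    Assumption: DerSimonian-Laird estimate of the between-context
    variance (tau^2); the abstract does not name a specific estimator.
    Returns (summary effect, standard error, tau^2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effect sizes with matching variances")

    # Fixed-effect (inverse-variance) weights and weighted mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * e for wi, e in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviation from the fixed-effect mean.
    q = sum(wi * (e - fixed_mean) ** 2 for wi, e in zip(w, effects))

    # Between-context variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)

    # Random-effects weights add tau^2 to each within-context variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    summary = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    std_err = math.sqrt(1.0 / sum(w_star))
    return summary, std_err, tau2
```

When the per-context effect sizes agree, the estimated between-context variance is zero and the summary reduces to the ordinary inverse-variance mean; when they disagree, the variance term grows and the standard error of the summary widens accordingly, which is how such a model reports the spread of a bias across contexts.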