The awareness and mitigation of biases are of fundamental importance for the fair and transparent use of contextual language models, yet they crucially depend on the accurate detection of biases as a precursor. Consequently, numerous bias detection methods have been proposed, which differ in their approach, the type of bias considered, and the data used for evaluation. However, while most detection methods are derived from the word embedding association test for static word embeddings, the reported results are heterogeneous, inconsistent, and ultimately inconclusive. To address this issue, we conduct a rigorous analysis and comparison of bias detection methods for contextual language models. Our results show that minor design and implementation decisions (or errors) have a substantial and often significant impact on the derived bias scores. Overall, we find the state of the field to be worse than previously acknowledged, owing to systematic and propagated errors in implementations, yet better than anticipated, since divergent results in the literature homogenize once these implementation errors are accounted for. Based on our findings, we conclude with a discussion of paths towards more robust and consistent bias detection methods.