Mitigating bias in machine learning systems requires refining our understanding of bias propagation pathways: from societal structures to large-scale data to trained models to impact on society. In this work, we focus on one aspect of the problem, namely bias amplification: the tendency of models to amplify the biases present in the data they are trained on. A metric for measuring bias amplification was introduced in the seminal work by Zhao et al. (2017); however, as we demonstrate, this metric suffers from a number of shortcomings, including conflating different types of bias amplification and failing to account for varying base rates of protected attributes. We introduce and analyze a new, decoupled metric for measuring bias amplification, $\text{BiasAmp}_{\rightarrow}$ (Directional Bias Amplification). We thoroughly analyze and discuss both the technical assumptions and normative implications of this metric. We provide suggestions about its measurement by cautioning against predicting sensitive attributes, encouraging the use of confidence intervals due to fluctuations in the fairness of models across runs, and discussing the limitations of what this metric captures. Throughout this paper, we work to provide an interrogative look at the technical measurement of bias amplification, guided by our normative ideas of what we want it to encompass. Code is located at https://github.com/princetonvisualai/directional-bias-amp.