Mitigating bias in machine learning systems requires refining our understanding of bias propagation pathways: from societal structures to large-scale data, to trained models, to their impact on society. In this work, we focus on one aspect of the problem, namely bias amplification: the tendency of models to amplify the biases present in the data they are trained on. A metric for measuring bias amplification was introduced in the seminal work of Zhao et al. (2017); however, as we demonstrate, this metric suffers from a number of shortcomings, including conflating different types of bias amplification and failing to account for varying base rates of protected classes. We introduce and analyze a new, decoupled metric for measuring bias amplification, $\text{BiasAmp}_{\rightarrow}$ (Directional Bias Amplification). We thoroughly analyze and discuss both the technical assumptions and the normative implications of this metric. We provide suggestions for its measurement: we caution against predicting sensitive attributes, encourage the use of confidence intervals because the fairness of models fluctuates across runs, and discuss the limitations of what this metric captures. Throughout this paper, we take an interrogative look at the technical measurement of bias amplification, guided by our normative ideas of what we want it to encompass.
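As a toy illustration of what "amplification" means numerically (not the paper's $\text{BiasAmp}_{\rightarrow}$ definition), the sketch below compares how often a task label co-occurs with a protected attribute in training labels versus in a model's predictions; all data, variable names, and correlation strengths are hypothetical and chosen only to make the gap visible.

```python
import numpy as np

# Toy sketch, assuming synthetic binary data: a model "amplifies" a bias when
# the attribute-label co-occurrence in its predictions exceeds the one in the
# training labels. Names and numbers below are illustrative, not from the paper.

def cooccurrence_rate(attribute, labels):
    """P(label = 1 | attribute = 1): label frequency among examples with the attribute."""
    mask = attribute == 1
    return labels[mask].mean()

rng = np.random.default_rng(0)
n = 10_000
attribute = rng.binomial(1, 0.3, size=n)                 # protected attribute, base rate 0.3
train_labels = rng.binomial(1, 0.5 + 0.20 * attribute)   # label mildly correlated with attribute
predictions = rng.binomial(1, 0.5 + 0.35 * attribute)    # hypothetical model exaggerating that correlation

train_rate = cooccurrence_rate(attribute, train_labels)
pred_rate = cooccurrence_rate(attribute, predictions)
print(f"training co-occurrence P(label|attribute):  {train_rate:.3f}")
print(f"predicted co-occurrence P(label|attribute): {pred_rate:.3f}")
print(f"amplification (predicted - training):       {pred_rate - train_rate:.3f}")
```

A positive gap in the last line is the informal notion of amplification the abstract refers to; the metrics discussed in the paper formalize this comparison while additionally handling direction (attribute-to-task versus task-to-attribute) and base rates.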