Many modern learning algorithms mitigate bias by enforcing fairness across coarsely defined groups related to a sensitive attribute such as gender or race. However, the same algorithms seldom account for the within-group biases that arise from the heterogeneity of group members. In this work, we characterize Social Norm Bias (SNoB), a subtle but consequential type of discrimination that may be exhibited by automated decision-making systems, even when these systems achieve group fairness objectives. We study this issue through the lens of gender bias in occupation classification from biographies. We quantify SNoB by measuring how an algorithm's predictions are associated with conformity to gender norms, which we assess using a machine learning approach. This framework reveals that, for classification tasks related to male-dominated occupations, fairness-aware classifiers favor biographies written in ways that align with masculine gender norms. We compare SNoB across fairness intervention techniques and show that post-processing interventions do not mitigate this type of bias at all.
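To make the quantification step concrete, the sketch below illustrates one way such an association could be computed: an auxiliary gender classifier is trained on biography text, its predicted probability of the "male" class serves as a proxy for conformity to masculine gender norms, and the Spearman correlation between the occupation classifier's scores and this proxy gives an SNoB-style association measure. The function names, the TF-IDF plus logistic regression setup, and the choice of Spearman correlation are illustrative assumptions and may differ from the paper's exact procedure.

```python
# Minimal sketch (assumptions, not the paper's exact method): quantify an
# SNoB-style association for one occupation as the rank correlation between
# the occupation classifier's scores and a gender-norm conformity proxy
# predicted by an auxiliary model.
from scipy.stats import spearmanr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def norm_conformity_scores(bios_train, genders_train, bios_eval):
    """Train an auxiliary gender classifier on biography text; its predicted
    probability of the 'male' class is used here as a (hypothetical) proxy
    for how strongly a biography conforms to masculine gender norms."""
    vec = TfidfVectorizer(min_df=2)
    X_train = vec.fit_transform(bios_train)
    aux = LogisticRegression(max_iter=1000).fit(X_train, genders_train)
    male_idx = list(aux.classes_).index("male")
    return aux.predict_proba(vec.transform(bios_eval))[:, male_idx]

def snob_association(occupation_scores, conformity_scores):
    """Spearman correlation between the occupation classifier's scores and
    the gender-norm conformity proxy; larger positive values indicate the
    classifier favors biographies written in a more 'masculine' style."""
    rho, _ = spearmanr(occupation_scores, conformity_scores)
    return rho
```

Comparing this association across classifiers trained with different fairness interventions (for a fixed occupation) is one plausible way to realize the comparison described above; the paper's own evaluation protocol may differ.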