We propose a fairness-aware learning framework that mitigates intersectional subgroup bias associated with protected attributes. Prior research has primarily focused on mitigating one kind of bias at a time, either by incorporating complex fairness-driven constraints into the optimization objective or by designing additional layers tailored to specific protected attributes. We introduce a simple and generic bias-mitigation approach that prevents models from learning relationships between protected attributes and the output variable by reducing the mutual information between them. We demonstrate that our approach is effective in reducing bias with little or no drop in accuracy. We also show that models trained with our framework become causally fair and insensitive to the values of protected attributes. Finally, we validate our approach by studying feature interactions between protected and non-protected attributes, and we show that these interactions are significantly reduced when our bias mitigation is applied.
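To make the core mechanism concrete, below is a minimal sketch of mutual-information-based bias mitigation. The abstract does not specify how the mutual information is estimated or minimized, so this sketch assumes a MINE-style neural estimator (Belghazi et al., 2018) as one plausible instantiation; the names `StatisticsNetwork`, `mine_mi_estimate`, and the trade-off weight `mi_weight` are illustrative, not the paper's actual implementation.

```python
# Hypothetical sketch: penalize I(A; Y_hat) between protected attributes A
# and model outputs Y_hat during training. The MI estimator here is a
# MINE-style lower bound, assumed for illustration only.
import torch
import torch.nn as nn

class StatisticsNetwork(nn.Module):
    """T(a, y_hat): scores joint vs. shuffled (marginal) pairs for MINE."""
    def __init__(self, attr_dim, out_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + out_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, a, y_hat):
        return self.net(torch.cat([a, y_hat], dim=1))

def mine_mi_estimate(stat_net, a, y_hat):
    """Donsker-Varadhan lower bound on I(A; Y_hat)."""
    joint = stat_net(a, y_hat).mean()
    # Shuffling A breaks the pairing, sampling the product of marginals.
    a_shuffled = a[torch.randperm(a.size(0))]
    marginal = torch.logsumexp(stat_net(a_shuffled, y_hat), dim=0) \
               - torch.log(torch.tensor(float(a.size(0))))
    return joint - marginal

# Toy data: x holds non-protected features, a holds protected attributes.
torch.manual_seed(0)
n, x_dim, attr_dim = 256, 10, 2
x = torch.randn(n, x_dim)
a = torch.randint(0, 2, (n, attr_dim)).float()
y = torch.randint(0, 2, (n, 1)).float()

model = nn.Sequential(nn.Linear(x_dim + attr_dim, 32), nn.ReLU(),
                      nn.Linear(32, 1))
stat_net = StatisticsNetwork(attr_dim, out_dim=1)
opt_model = torch.optim.Adam(model.parameters(), lr=1e-3)
opt_stat = torch.optim.Adam(stat_net.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
mi_weight = 1.0  # bias/accuracy trade-off hyperparameter (illustrative)

for step in range(200):
    # 1) Tighten the MI lower bound (estimator maximizes it).
    logits = model(torch.cat([x, a], dim=1))
    mi = mine_mi_estimate(stat_net, a, logits.detach())
    opt_stat.zero_grad()
    (-mi).backward()
    opt_stat.step()
    # 2) Fit the task while penalizing the estimated MI (model minimizes it).
    logits = model(torch.cat([x, a], dim=1))
    loss = bce(logits, y) + mi_weight * mine_mi_estimate(stat_net, a, logits)
    opt_model.zero_grad()
    loss.backward()
    opt_model.step()
```

In this adversarial setup the estimator network pushes the MI bound up while the task model pushes it down, driving the model's outputs toward independence from the protected attributes; the abstract's claims about causal fairness and reduced feature interactions would then be evaluated on the resulting trained model.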