Algorithmic fairness is becoming increasingly important in data mining and machine learning, and one of its most fundamental notions is group fairness. The vast majority of existing works on group fairness, with a few exceptions, focus on debiasing with respect to a single sensitive attribute, despite the fact that the co-existence of multiple sensitive attributes (e.g., gender, race, marital status) is commonplace in the real world. As such, methods are needed that can simultaneously ensure a fair learning outcome with respect to all sensitive attributes of concern. In this paper, we study multi-group fairness in machine learning (MultiFair), where statistical parity, a representative group fairness measure, is guaranteed among demographic groups formed by multiple sensitive attributes of interest. We formulate it as a mutual information minimization problem and propose a generic end-to-end algorithmic framework to solve it. The key idea is to leverage a variational representation of mutual information, which considers the variational distribution between learning outcomes and sensitive attributes, as well as the density ratio between the variational and the original distributions. Our proposed framework is generalizable to many different settings, including other statistical notions of fairness, and can handle any type of learning task equipped with a gradient-based optimizer. Empirical evaluations on the fair classification task with three real-world datasets demonstrate that our framework effectively debiases the classification results with minimal impact on classification accuracy.
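To make the key idea concrete, the sketch below illustrates one common way to debias a classifier by minimizing a variational bound on the mutual information between its predictions and a joint sensitive-attribute variable. This is a minimal PyTorch sketch assuming a CLUB-style variational upper bound, not the paper's exact formulation; the network architectures, the joint-group encoding of the sensitive attributes, and the trade-off weight lam are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    # Task model: maps features to label logits.
    def __init__(self, d_in, n_classes):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(),
                                 nn.Linear(64, n_classes))

    def forward(self, x):
        return self.net(x)

class VariationalQ(nn.Module):
    # Variational distribution q(s | y_hat) over the joint demographic group
    # formed by all sensitive attributes (e.g., gender x race -> 4 groups).
    def __init__(self, n_classes, n_groups):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_classes, 32), nn.ReLU(),
                                 nn.Linear(32, n_groups))

    def log_prob(self, y_hat, s):
        return F.log_softmax(self.net(y_hat), dim=-1).gather(
            1, s.unsqueeze(1)).squeeze(1)

def mi_upper_bound(q, y_hat, s):
    # CLUB-style bound: E_{p(y,s)}[log q(s|y)] - E_{p(y)p(s)}[log q(s|y)],
    # with the product-of-marginals term approximated by shuffling s in-batch.
    joint = q.log_prob(y_hat, s).mean()
    marginal = q.log_prob(y_hat, s[torch.randperm(s.size(0))]).mean()
    return joint - marginal

# Toy data: two binary sensitive attributes encoded as 4 joint groups.
x = torch.randn(256, 10)
y = torch.randint(0, 2, (256,))
s = torch.randint(0, 4, (256,))

clf, q = Classifier(10, 2), VariationalQ(2, 4)
opt_clf = torch.optim.Adam(clf.parameters(), lr=1e-3)
opt_q = torch.optim.Adam(q.parameters(), lr=1e-3)
lam = 1.0  # fairness-accuracy trade-off weight (assumed hyperparameter)

for step in range(200):
    # (1) Fit q(s | y_hat) by maximum likelihood on the current predictions,
    #     so the bound tracks the true conditional p(s | y_hat).
    y_hat = F.softmax(clf(x), dim=-1).detach()
    loss_q = -q.log_prob(y_hat, s).mean()
    opt_q.zero_grad(); loss_q.backward(); opt_q.step()

    # (2) Update the classifier on task loss plus the penalized MI bound;
    #     driving I(y_hat; s) toward 0 pushes predictions toward the same
    #     distribution in every demographic group (statistical parity).
    logits = clf(x)
    loss = F.cross_entropy(logits, y) + lam * mi_upper_bound(
        q, F.softmax(logits, dim=-1), s)
    opt_clf.zero_grad(); loss.backward(); opt_clf.step()

The alternating update mirrors the adversarial flavor of variational mutual information estimation: q chases the conditional p(s | y_hat), while the classifier is penalized for whatever predictability of s it leaves in its outputs.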