It is critically important to be aware of the historical discrimination embedded in the data and to consider fairness measures to reduce bias throughout the predictive modeling pipeline. Given the various notions of fairness defined in the literature, investigating the correlation and interaction among metrics is vital for addressing unfairness. Practitioners and data scientists should be able to comprehend each metric and examine its impact on the others given the context, use case, and regulations. Exploring the combinatorial space of different metrics for such an examination is burdensome. To alleviate the burden of selecting fairness notions for consideration, we propose a framework that estimates the correlation among fairness notions. Our framework consequently identifies a set of diverse and semantically distinct metrics as representatives for a given context. We propose a Monte Carlo sampling technique for computing the correlations between fairness metrics through indirect and efficient perturbation in the model space. Using the estimated correlations, we then find a subset of representative metrics. The proposed method is generic and can be applied to any arbitrary set of fairness metrics. We demonstrate the validity of the proposal through comprehensive experiments on real-world benchmark datasets.
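The following is a minimal sketch of the idea summarized above, not the paper's implementation. It assumes a binary classifier whose decision threshold is perturbed as a stand-in for sampling nearby models, two illustrative group-fairness metrics (statistical parity difference and equal opportunity difference), and a simple correlation estimate across the sampled models; all function and variable names are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: scores s, labels y, protected attribute a (0/1).
n = 2000
a = rng.integers(0, 2, size=n)
y = rng.integers(0, 2, size=n)
s = np.clip(0.5 * y + 0.1 * a + rng.normal(0, 0.25, size=n), 0, 1)

def statistical_parity(pred, a):
    # Difference in positive-prediction rates between groups.
    return pred[a == 1].mean() - pred[a == 0].mean()

def equal_opportunity(pred, y, a):
    # Difference in true-positive rates between groups.
    tpr1 = pred[(a == 1) & (y == 1)].mean()
    tpr0 = pred[(a == 0) & (y == 1)].mean()
    return tpr1 - tpr0

# Monte Carlo loop: perturb the model (here, only its decision threshold)
# and record every fairness metric for each sampled model.
metrics = {"SPD": [], "EOD": []}
for _ in range(500):
    thr = rng.uniform(0.2, 0.8)          # indirect perturbation in model space
    pred = (s >= thr).astype(int)
    metrics["SPD"].append(statistical_parity(pred, a))
    metrics["EOD"].append(equal_opportunity(pred, y, a))

# Estimated correlation between the two metrics across sampled models.
corr = np.corrcoef(metrics["SPD"], metrics["EOD"])[0, 1]
print(f"Estimated correlation(SPD, EOD) = {corr:.3f}")

With a larger set of metrics, highly correlated ones could be clustered and a single representative kept per cluster; the sketch only illustrates the correlation-estimation step.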