Fair machine learning aims to mitigate the bias of model predictions against certain subpopulations defined by sensitive attributes such as race and gender. Among the many existing fairness notions, counterfactual fairness measures model fairness from a causal perspective by comparing the prediction for each individual on the original data with the prediction on counterfactuals, in which this individual's sensitive attribute values have been modified. Recently, a few works have extended counterfactual fairness to graph data, but most of them neglect the following facts, which can lead to bias: 1) the sensitive attributes of each node's neighbors may causally affect the prediction for this node; 2) the sensitive attributes may causally affect other features and the graph structure. To tackle these issues, in this paper, we propose a novel fairness notion - graph counterfactual fairness - which accounts for the biases induced by the above facts. To learn node representations towards graph counterfactual fairness, we propose a novel framework based on counterfactual data augmentation. In this framework, we generate counterfactuals corresponding to perturbations on each node's and its neighbors' sensitive attributes. We then enforce fairness by minimizing the discrepancy between the representations learned from the original graph and from the counterfactuals for each node. Experiments on both synthetic and real-world graphs show that our framework outperforms the state-of-the-art baselines in graph counterfactual fairness, and also achieves comparable prediction performance.
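To make the core training objective concrete, below is a minimal sketch (not the authors' actual implementation) of the discrepancy-minimization idea the abstract describes: a GNN encoder produces node representations on the original graph and on a counterfactual graph with perturbed sensitive attributes, and a penalty term shrinks the distance between the two. All names here (`SimpleGNN`, `flip_sensitive`, `lambda_fair`) are illustrative assumptions, and the toy counterfactual only flips a binary sensitive attribute column; the paper additionally models how sensitive attributes causally affect other features and the graph structure, which this sketch omits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGNN(nn.Module):
    """One-layer mean-aggregation GNN over a dense adjacency matrix."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, hid_dim)

    def forward(self, x, adj):
        # Row-normalized neighborhood aggregation with self-loops, so each
        # node's representation depends on its own and its neighbors' features
        # (including their sensitive attributes).
        adj_hat = adj + torch.eye(adj.size(0))
        deg = adj_hat.sum(dim=1, keepdim=True)
        return F.relu(self.lin(adj_hat @ x / deg))

def flip_sensitive(x, s_idx):
    # Toy counterfactual: flip a binary sensitive attribute column.
    # (Assumption: in the actual framework, counterfactuals are generated by
    # a causal model, not by a naive flip.)
    x_cf = x.clone()
    x_cf[:, s_idx] = 1.0 - x_cf[:, s_idx]
    return x_cf

# Toy data: 5 nodes, 4 features; column 0 is the binary sensitive attribute.
x = torch.rand(5, 4); x[:, 0] = (x[:, 0] > 0.5).float()
adj = (torch.rand(5, 5) > 0.5).float(); adj = ((adj + adj.T) > 0).float()
y = torch.randint(0, 2, (5,))

enc = SimpleGNN(4, 8)
clf = nn.Linear(8, 2)
opt = torch.optim.Adam(list(enc.parameters()) + list(clf.parameters()), lr=0.01)
lambda_fair = 0.5  # illustrative trade-off between utility and fairness

for _ in range(100):
    z = enc(x, adj)                        # representations on the original graph
    z_cf = enc(flip_sensitive(x, 0), adj)  # representations on the counterfactual
    task_loss = F.cross_entropy(clf(z), y)
    # Fairness term: per-node discrepancy between original and counterfactual
    # representations, as described in the abstract.
    fair_loss = (z - z_cf).pow(2).sum(dim=1).mean()
    loss = task_loss + lambda_fair * fair_loss
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the encoder aggregates over neighbors, flipping a node's sensitive attribute also changes its neighbors' representations, so the penalty covers perturbations on both each node's and its neighbors' sensitive attributes.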