Algorithmic fairness has become an important machine learning problem, especially for mission-critical Web applications. This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations. Unlike existing models that target a single type of fairness, our model jointly optimizes for two fairness criteria, group fairness and counterfactual fairness, and hence makes fairer predictions at both the group and individual levels. Our model uses contrastive loss to generate embeddings that are indistinguishable for each protected group, while forcing the embeddings of counterfactual pairs to be similar. It then uses a self-knowledge distillation method to maintain the quality of the learned representations for downstream tasks. Extensive analysis over multiple datasets confirms the model's validity and further shows the synergy of jointly addressing the two fairness criteria, suggesting the model's potential value in fair intelligent Web applications.
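To make the two objectives concrete, the following is a minimal sketch (not the authors' exact implementation) of jointly optimizing a counterfactual contrastive term and a self-knowledge distillation term. The names `encoder`, `teacher`, `x_cf`, `temperature`, and `alpha` are illustrative assumptions: `x_cf` is taken to be a counterfactual version of `x` that differs only in the sensitive attribute, and `teacher` is a frozen or momentum copy of the encoder.

```python
# Hypothetical sketch of the dual objective described in the abstract.
import torch
import torch.nn.functional as F

def dual_fair_loss(encoder, teacher, x, x_cf, temperature=0.1, alpha=1.0):
    """Counterfactual contrastive term + self-knowledge distillation term."""
    z = F.normalize(encoder(x), dim=1)        # embeddings of original samples
    z_cf = F.normalize(encoder(x_cf), dim=1)  # embeddings of counterfactual samples

    # Contrastive term (InfoNCE-style): each sample's counterfactual is its
    # positive, and the other samples in the batch serve as negatives, pulling
    # counterfactual pairs together in embedding space.
    logits = z @ z_cf.t() / temperature
    targets = torch.arange(z.size(0), device=z.device)
    contrastive = F.cross_entropy(logits, targets)

    # Self-knowledge distillation: keep the student embeddings close to a
    # frozen/momentum teacher so representation quality is preserved for
    # downstream tasks while the fairness term reshapes the space.
    with torch.no_grad():
        z_teacher = F.normalize(teacher(x), dim=1)
    distill = (1 - (z * z_teacher).sum(dim=1)).mean()  # cosine-distance loss

    return contrastive + alpha * distill
```

In this sketch, `alpha` trades off fairness against representation quality; the actual weighting, pair-generation procedure, and distillation scheme used by DualFair are specified in the paper itself.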