Counterfactual fairness is an approach to AI fairness that bases decisions on the outcome that an individual with a sensitive status would have had without that status. This paper proposes Double Machine Learning (DML) Fairness, which treats counterfactual fairness in regression problems as analogous to estimating counterfactual outcomes in causal inference under the Potential Outcomes framework. The method uses arbitrary machine learning methods to partial out the effect of sensitive variables on both nonsensitive variables and outcomes. Assuming that the effects of the two sets of variables are additively separable, outcomes will be approximately equalised and individual-level outcomes will be counterfactually fair. The paper demonstrates the approach in a simulation study of discrimination in workplace hiring and in an application to real data, estimating the GPAs of law school students. It then discusses when such a method is appropriate for problems of real-world discrimination, where constructs are conceptually complex, and finally whether DML Fairness can achieve justice in these settings.
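The partialling-out step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the data-generating process, the function names (`simulate`, `cross_fit_residuals`, `dml_fair_predictions`), the choice of out-of-fold group means as the nuisance learner, and the way predictions are re-centred on the overall mean are all assumptions made for illustration. In practice any machine learning regressor could stand in for the group-mean learner.

```python
import numpy as np


def simulate(n=5000, seed=0):
    """Toy data (hypothetical): binary sensitive attribute a, nonsensitive x, outcome y."""
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, size=n)               # sensitive attribute
    x = 1.0 * a + rng.normal(size=n)             # nonsensitive variable, shifted by a
    y = 2.0 * x - 1.5 * a + rng.normal(size=n)   # additively separable effects of x and a
    return a, x, y


def cross_fit_residuals(a, v, n_folds=2, seed=0):
    """Residualise v on a with cross-fitting.

    The nuisance learner here is the out-of-fold mean of v for each value of a,
    a simple stand-in for an arbitrary machine learning regressor.
    """
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(a)) % n_folds    # random, near-equal fold assignment
    resid = np.empty(len(v), dtype=float)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        for g in np.unique(a):
            mean_g = v[train & (a == g)].mean()  # fitted on the other folds only
            resid[test & (a == g)] = v[test & (a == g)] - mean_g
    return resid


def dml_fair_predictions(a, x, y):
    """Partial a out of x and y, then regress residual on residual."""
    x_res = cross_fit_residuals(a, x)
    y_res = cross_fit_residuals(a, y)
    beta = (x_res @ y_res) / (x_res @ x_res)     # one-dimensional least squares slope
    # Assumption: predictions are re-centred on the overall mean of y.
    return y.mean() + beta * x_res, beta


if __name__ == "__main__":
    a, x, y = simulate()
    pred, beta = dml_fair_predictions(a, x, y)
    print("estimated slope on x:", beta)
    print("raw outcome gap between groups:", y[a == 1].mean() - y[a == 0].mean())
    print("prediction gap between groups:", pred[a == 1].mean() - pred[a == 0].mean())
```

Under the additive separability assumed in the simulation, the residual-on-residual slope recovers the effect of the nonsensitive variable, and the gap in mean predictions between the two groups shrinks towards zero because the group-level shift has been removed from both `x` and `y`.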