Although machine learning classifiers have been increasingly used in high-stakes decision making (e.g., cancer diagnosis, criminal prosecution decisions), they have demonstrated biases against underrepresented groups. Standard definitions of fairness require access to sensitive attributes of interest (e.g., gender and race), which are often unavailable. In this work we demonstrate that in these settings where sensitive attributes are unknown, one can still reliably estimate and ultimately control for fairness by using proxy sensitive attributes derived from a sensitive attribute predictor. Specifically, we first show that with just a little knowledge of the complete data distribution, one may use a sensitive attribute predictor to obtain upper and lower bounds of the classifier's true fairness metric. Second, we demonstrate how one can provably control for fairness with respect to the true sensitive attributes by controlling for fairness with respect to the proxy sensitive attributes. Our results hold under assumptions that are significantly milder than previous works. We illustrate our results on a series of synthetic and real datasets.
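The core idea (measuring a fairness metric with proxy sensitive attributes produced by a predictor) can be illustrated with a minimal sketch. This is not the paper's actual estimator or bounds; the demographic-parity metric, the toy data, and the proxy-noise model (independent attribute flips with rate `eps`) are all illustrative assumptions.

```python
import numpy as np

def dp_gap(y_pred, group):
    """Demographic-parity gap: |P(y_hat=1 | g=1) - P(y_hat=1 | g=0)|,
    computed with whatever group labels we have (true or proxy)."""
    return abs(y_pred[group == 1].mean() - y_pred[group == 0].mean())

rng = np.random.default_rng(0)
n = 10_000

# True sensitive attribute (unobserved in the paper's setting).
a_true = rng.integers(0, 2, n)

# Proxy attribute from a hypothetical predictor: flips the true
# attribute independently with probability eps (an assumed noise model).
eps = 0.05
a_proxy = np.where(rng.random(n) < eps, 1 - a_true, a_true)

# A classifier that favors group 1 (accept rate 0.7 vs 0.5),
# so the true demographic-parity gap is about 0.2.
y_pred = (rng.random(n) < np.where(a_true == 1, 0.7, 0.5)).astype(int)

gap_true = dp_gap(y_pred, a_true)    # metric we cannot compute in practice
gap_proxy = dp_gap(y_pred, a_proxy)  # metric we can compute from the proxy
print(f"true gap: {gap_true:.3f}, proxy gap: {gap_proxy:.3f}")
```

Under this independent-flip noise model the proxy gap shrinks toward zero by roughly a factor of (1 - 2*eps), so the measured proxy gap stays close to, and slightly below, the true gap; the paper's contribution is to turn this kind of relationship into provable upper and lower bounds under much milder assumptions.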