Many popular algorithmic fairness measures depend on the joint distribution of predictions, outcomes, and a sensitive feature such as race or gender. These measures are sensitive to distribution shift: a predictor trained to satisfy one of these fairness definitions may become unfair if the distribution changes. In performative prediction settings, however, predictors are precisely intended to induce distribution shift. For example, in many applications in criminal justice, healthcare, and consumer finance, the purpose of building a predictor is to reduce the rate of adverse outcomes such as recidivism, hospitalization, or default on a loan. We formalize the effect of such predictors as a type of concept shift, a particular variety of distribution shift, and show both theoretically and via simulated examples how this causes predictors that are fair when they are trained to become unfair when they are deployed. We further show how many of these issues can be avoided by using fairness definitions that depend on counterfactual rather than observable outcomes.
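To make the claim concrete, the following is a minimal simulation sketch, not the paper's actual experiments: all distributions, thresholds, and the intervention strength (`effect`) are illustrative assumptions. A predictor is calibrated with group-specific thresholds so that false positive rates are equal on the pre-deployment distribution; deployment then triggers an intervention for flagged individuals that lowers their probability of the adverse outcome (a concept shift in P(Y | X, A) induced by the predictor itself), and the same false positive rate metric, recomputed on the induced distribution, is no longer equal across groups.

```python
# Illustrative sketch (assumed setup, not the paper's simulations):
# a predictor that equalizes false positive rates before deployment
# stops doing so once its own deployment shifts P(Y | X, A).
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Sensitive attribute A and a risk score X whose distribution differs by group.
a = rng.integers(0, 2, size=n)
x = rng.normal(loc=np.where(a == 1, 1.0, 0.0), scale=1.0, size=n)

# Pre-deployment ("training") outcomes: P(Y=1 | X) = sigmoid(X - 1).
p_pre = sigmoid(x - 1.0)
y_pre = rng.binomial(1, p_pre)

# "Fair" predictor: group-specific thresholds chosen so that the false
# positive rate is 20% in each group on the pre-deployment distribution.
target_fpr = 0.20
thresholds = {g: np.quantile(x[(a == g) & (y_pre == 0)], 1 - target_fpr)
              for g in (0, 1)}
flag = x > np.where(a == 1, thresholds[1], thresholds[0])

def fpr(y, flag, a, g):
    """Empirical P(flagged | Y = 0, A = g)."""
    mask = (a == g) & (y == 0)
    return flag[mask].mean()

print("pre-deployment FPR :", fpr(y_pre, flag, a, 0), fpr(y_pre, flag, a, 1))

# Deployment induces a concept shift: flagged individuals receive an
# intervention that cuts their probability of the adverse outcome by 70%
# (an assumed effect size), so P(Y | X, A) changes through the predictor.
effect = 0.70
p_post = p_pre * np.where(flag, 1 - effect, 1.0)
y_post = rng.binomial(1, p_post)

# Recomputing the same metric on the induced distribution reveals a gap.
print("post-deployment FPR:", fpr(y_post, flag, a, 0), fpr(y_post, flag, a, 1))
```

Because the higher-risk group has more flagged individuals whose adverse outcomes are averted, the intervention inflates its flagged-but-negative count more, so the observational false positive rates diverge even though the predictor itself is unchanged; a counterfactual criterion evaluated on the pre-intervention outcomes would not be affected by this shift.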