In the dominant paradigm for designing equitable machine learning systems, one works to ensure that model predictions satisfy various fairness criteria, such as parity in error rates across race, gender, and other legally protected traits. That approach, however, typically divorces predictions from the downstream outcomes they ultimately affect, and, as a result, can induce unexpected harms. Here we present an alternative framework for fairness that directly anticipates the consequences of actions. Stakeholders first specify preferences over the possible outcomes of an algorithmically informed decision-making process. For example, lenders may prefer extending credit to those most likely to repay a loan, while also preferring similar lending rates across neighborhoods. One then searches the space of decision policies to maximize the specified utility. We develop and describe a method for efficiently learning these optimal policies from data for a large family of expressive utility functions, facilitating a more holistic approach to equitable decision-making.
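To make the framework concrete, the following is a minimal, hypothetical sketch of the kind of policy search the abstract describes, applied to the lending example. It is not the authors' method: the synthetic data, the specific utility (expected repayments minus a penalty on the gap in lending rates across neighborhoods), the penalty weight `lam`, and the policy class (per-neighborhood approval thresholds found by grid search) are all illustrative assumptions.

```python
# Hypothetical sketch only: data, utility form, and policy class are assumptions,
# not the paper's actual method or datasets.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic lending data (assumption): two neighborhoods, a credit score,
# and repayment behavior that depends on the score.
n = 20_000
group = rng.integers(0, 2, size=n)                  # neighborhood 0 or 1
score = rng.normal(loc=np.where(group == 0, 0.2, -0.2), scale=1.0, size=n)
p_repay = 1.0 / (1.0 + np.exp(-1.5 * score))        # true repayment probability
repaid = rng.random(n) < p_repay                    # repayment outcome if funded


def utility(thresholds, lam=2.0):
    """Utility of a policy that lends when score exceeds the group's threshold.

    First term: reward for lending to likely repayers (repayments minus defaults).
    Second term: penalty for unequal lending rates across neighborhoods.
    Both the functional form and lam are illustrative choices.
    """
    lend = score > thresholds[group]
    profit = np.mean(np.where(lend, np.where(repaid, 1.0, -1.0), 0.0))
    rates = [lend[group == g].mean() for g in (0, 1)]
    return profit - lam * abs(rates[0] - rates[1])


# Grid search over per-neighborhood thresholds: a crude stand-in for
# "searching the space of decision policies to maximize the specified utility."
grid = np.linspace(-2, 2, 81)
best = max(
    ((t0, t1) for t0 in grid for t1 in grid),
    key=lambda t: utility(np.array(t)),
)
print("best thresholds per neighborhood:", best)
print("utility at optimum:", round(utility(np.array(best)), 4))
```

The point of the sketch is the structure, not the particulars: stakeholders encode their preferences over outcomes in a single utility, and the decision rule is then chosen to maximize that utility rather than to satisfy a prediction-level parity constraint in isolation.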