A Ranking Approach to Fair Classification

Algorithmic decision systems are increasingly used in areas such as hiring, school admission, and loan approval. Typically, these systems rely on labeled data to train a classification model. In many scenarios, however, ground-truth labels are unavailable, and we only have access to imperfect labels that result from (potentially biased) human decisions. Despite being imperfect, historical decisions often contain useful information about the unobserved true labels. In this paper, we focus on scenarios where only imperfect labels are available and propose a new fair ranking-based decision system that builds on monotonic relationships between legitimate features and the outcome. Our approach is both intuitive and easy to implement, and thus particularly suitable for adoption in real-world settings. In more detail, we introduce a distance-based decision criterion that incorporates useful information from historical decisions and accounts for unwanted correlation between protected and legitimate features. Through extensive experiments on synthetic and real-world data, we show that our method is fair in the sense that (a) it assigns the desirable outcome to the most qualified individuals, and (b) it removes the effect of stereotypes in decision-making, thereby outperforming traditional classification algorithms. Additionally, we show theoretically that our method is consistent with a prominent concept of individual fairness, which states that "similar individuals should be treated similarly."
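To make the idea of a distance-based ranking over monotone features concrete, the following is a minimal illustrative sketch and not the criterion proposed in the paper. It assumes that every legitimate feature is oriented so that higher values mean "more qualified", it uses the component-wise best observed value as a hypothetical ideal point, and it ignores both historical decisions and protected features, which the actual method incorporates. The names `distance_to_ideal`, `rank_and_select`, and the toy `candidates` data are invented for illustration.

```python
import math


def distance_to_ideal(x, ideal):
    """Euclidean distance from a feature vector x to the ideal point."""
    return math.sqrt(sum((i - v) ** 2 for v, i in zip(x, ideal)))


def rank_and_select(features, k):
    """Rank individuals by closeness to the ideal point and select the top k.

    features: list of equal-length lists of legitimate feature values,
              assumed to be oriented so that higher is better.
    Returns (order, dists, selected): indices sorted from most to least
    qualified, the distance of each individual, and the set of selected indices.
    """
    # Hypothetical ideal point: the best observed value of each feature.
    ideal = [max(column) for column in zip(*features)]
    dists = [distance_to_ideal(x, ideal) for x in features]
    # A smaller distance to the ideal point means a higher rank.
    order = sorted(range(len(features)), key=lambda i: dists[i])
    selected = set(order[:k])
    return order, dists, selected


# Toy data (invented): two legitimate features per candidate.
candidates = [
    [0.90, 0.80],
    [0.40, 0.50],
    [0.85, 0.80],
    [0.20, 0.90],
]
order, dists, selected = rank_and_select(candidates, k=2)
print(order)     # ranking of candidate indices, most qualified first
print(selected)  # indices that receive the desirable outcome
```

Under these assumptions, a candidate who is at least as good as another on every feature is never ranked below them, and by the triangle inequality two candidates whose feature vectors are close also have distances to the ideal point that differ by at most their mutual distance. This is one simple way in which a distance-based score can be read as treating similar individuals similarly.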
