Fairness in automated decision-making systems has gained increasing attention as their applications expand to high-stakes real-world domains. To facilitate the design of fair ML systems, it is essential to understand the potential trade-off between fairness and predictive power, and how to construct the optimal predictor under a given fairness constraint. In this paper, for general classification problems under the group fairness criterion of demographic parity (DP), we precisely characterize the trade-off between DP and classification accuracy, referred to as the minimum cost of fairness. Our insight comes from the key observation that finding the optimal fair classifier is equivalent to solving a Wasserstein-barycenter problem under the $\ell_1$-norm restricted to the vertices of the probability simplex. Inspired by this characterization, we provide a construction of an optimal fair classifier achieving the minimum cost via the composition of the Bayes regressor and optimal transports from its output distributions to the barycenter. Our construction naturally leads to an algorithm for post-processing any pre-trained predictor to satisfy DP fairness, complemented with finite-sample guarantees. Experiments on real-world datasets demonstrate the effectiveness of our approaches.
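To give a concrete sense of the post-processing idea, the following is a minimal sketch for the special case of binary classification with one-dimensional scores in $[0,1]$. In this 1-D setting the Wasserstein barycenter of the group-wise score distributions is the weighted average of their quantile functions, and pushing each group's scores through its own empirical CDF followed by the barycenter quantile function equalizes score distributions across groups, so any threshold rule on the adjusted scores satisfies DP. All names are illustrative; the paper's general multiclass construction on the probability simplex is not reproduced here.

```python
import numpy as np

def dp_postprocess(scores_by_group, weights=None, grid_size=101):
    """Sketch: align group-wise score distributions to their 1-D
    Wasserstein barycenter, so post-processed scores are identically
    distributed across groups (demographic parity at any threshold).

    scores_by_group: dict mapping group label -> 1-D array of scores.
    weights: optional dict of barycenter weights (default: group sizes).
    """
    groups = list(scores_by_group)
    if weights is None:
        n = sum(len(scores_by_group[g]) for g in groups)
        weights = {g: len(scores_by_group[g]) / n for g in groups}

    # Empirical quantile functions of each group on a common grid.
    qs = np.linspace(0.0, 1.0, grid_size)
    quantiles = {g: np.quantile(scores_by_group[g], qs) for g in groups}

    # In 1-D, the Wasserstein barycenter's quantile function is the
    # weighted average of the groups' quantile functions.
    bary_q = sum(weights[g] * quantiles[g] for g in groups)

    adjusted = {}
    for g in groups:
        s = np.asarray(scores_by_group[g], dtype=float)
        # Empirical CDF rank of each score within its own group,
        # then push through the barycenter quantile function.
        ranks = np.searchsorted(np.sort(s), s, side="right") / len(s)
        adjusted[g] = np.interp(ranks, qs, bary_q)
    return adjusted
```

A quick usage check: feeding in two groups whose raw scores occupy disjoint ranges yields adjusted scores with matching distributions, so the DP gap of any thresholded classifier on the output vanishes (up to discretization error from the quantile grid).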