We study fairness in the context of classification where performance is measured by the area under the receiver operating characteristic curve (AUC), a metric commonly used to evaluate prediction models. The same classifier can yield significantly different AUCs across protected groups, and in real-world applications it is often desirable to reduce such cross-group differences. We address the problem of how to acquire additional features that most improve AUC for the disadvantaged group. We develop a novel approach, fairAUC, based on feature augmentation (adding features) to mitigate bias between identifiable groups. The approach requires only a few summary statistics to offer provable guarantees on AUC improvement, and it gives managers the flexibility to decide where on the fairness-accuracy tradeoff they would like to be. We evaluate fairAUC on synthetic and real-world datasets and find that it significantly improves AUC for the disadvantaged group relative to benchmarks that maximize overall AUC or minimize bias between groups.
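To make the quantity being targeted concrete, the following minimal sketch (not the paper's fairAUC algorithm) computes per-group AUC and the cross-group gap that fairAUC aims to reduce. It assumes scikit-learn is available; the synthetic data and variable names are hypothetical, with scores simulated to be less informative for one group.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic labels, prediction scores, and a binary protected attribute.
n = 1000
group = rng.integers(0, 2, size=n)   # protected group: 0 or 1
y = rng.integers(0, 2, size=n)       # true binary labels
# Scores carry more signal for group 1 than for group 0 (simulated bias).
signal = np.where(group == 1, 1.5, 0.5)
score = y * signal + rng.normal(size=n)

# Per-group AUC: the same classifier evaluated separately on each group.
auc_by_group = {
    g: roc_auc_score(y[group == g], score[group == g]) for g in (0, 1)
}
gap = abs(auc_by_group[0] - auc_by_group[1])
print(f"AUC group 0: {auc_by_group[0]:.3f}, AUC group 1: {auc_by_group[1]:.3f}")
print(f"Cross-group AUC gap: {gap:.3f}")
```

In this illustration, group 0 plays the role of the disadvantaged group; fairAUC's feature-augmentation step would acquire additional features chosen to raise `auc_by_group[0]`, shrinking the gap.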