We study fairness in the context of classification where performance is measured by the area under the curve (AUC) of the receiver operating characteristic. AUC is commonly used when both Type I (false positive) and Type II (false negative) errors are important. However, the same classifier can have significantly varying AUCs across protected groups, and, in real-world applications, it is often desirable to reduce such cross-group differences. We address the problem of selecting additional features that most improve AUC for the disadvantaged group. Our results establish that the unconditional variance of features is uninformative about AUC fairness, whereas class-conditional variance is informative. Using this connection, we develop a novel approach, fairAUC, based on feature augmentation (adding features) to mitigate bias between identifiable groups. We evaluate fairAUC on synthetic and real-world (COMPAS) datasets and find that it significantly improves AUC for the disadvantaged group relative to benchmarks that maximize overall AUC or minimize bias between groups.
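To make the feature-augmentation idea concrete, the sketch below greedily adds candidate features that most improve the disadvantaged group's AUC. This is a minimal illustration, not the paper's actual fairAUC procedure: the helper name greedy_augment, the data layout (a base matrix plus a pool of candidate columns), the logistic-regression scorer, and the in-sample AUC evaluation are all assumptions made for the example.

```python
# A minimal sketch of greedy feature augmentation for the disadvantaged
# group's AUC, in the spirit of the approach described above. All names and
# the data layout are illustrative assumptions, not the paper's algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def greedy_augment(X, y, group, candidates, n_add=3, disadvantaged=1):
    """Greedily pick columns of `candidates` that most improve the
    disadvantaged group's AUC (hypothetical helper).

    Assumes `y[group == disadvantaged]` contains both classes, and
    `n_add <= candidates.shape[1]`. AUC is measured in-sample here;
    a held-out split would be used in practice.
    """
    chosen = []
    mask = group == disadvantaged
    for _ in range(n_add):
        best_j, best_auc = None, -np.inf
        for j in range(candidates.shape[1]):
            if j in chosen:
                continue
            # Augment the base features with the already-chosen columns
            # plus the candidate column j.
            X_aug = np.hstack([X, candidates[:, chosen + [j]]])
            clf = LogisticRegression(max_iter=1000).fit(X_aug, y)
            scores = clf.predict_proba(X_aug)[:, 1]
            # Score the candidate by the disadvantaged group's AUC only.
            auc = roc_auc_score(y[mask], scores[mask])
            if auc > best_auc:
                best_j, best_auc = j, auc
        chosen.append(best_j)
    return chosen
```

The greedy wrapper mirrors the abstract's selection criterion (disadvantaged-group AUC rather than overall AUC); swapping the objective for overall AUC recovers the standard feature-selection benchmark the paper compares against.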