In recent years, there has been a flurry of research focusing on the fairness of machine learning models, and in particular on quantifying and eliminating bias against protected subgroups. One line of work generalizes the notion of protected subgroups beyond simple discrete classes by introducing the notion of a "rich subgroup", and seeks to train models that are calibrated or equalize error rates with respect to these richer subgroup classes. Largely orthogonally, local model explanation methods have been developed that, given a classifier h and a test point x, attribute influence for the prediction h(x) to the individual features of x. This raises a natural question: Do local model explanation methods attribute different feature importance values on average across different protected subgroups, and can we detect these disparities efficiently? If the model places high weight on a given feature in a specific protected subgroup, but not on the dataset overall (or vice versa), this could be a potential indicator of bias in the predictive model or the underlying data generating process, and is at the very least a useful diagnostic that signals the need for a domain expert to delve deeper. In this paper, we formally introduce the notion of feature importance disparity (FID) in the context of rich subgroups, design oracle-efficient algorithms to identify large FID subgroups, and conduct a thorough empirical analysis that establishes auditing for FID as an important method to investigate dataset bias. Our experiments show that across 4 datasets and 4 common feature importance methods, our algorithms find (feature, subgroup) pairs that simultaneously: (i) have subgroup feature importance that is often an order of magnitude different from the importance on the dataset as a whole, (ii) generalize out of sample, and (iii) yield interesting discussions about potential bias inherent in these datasets.
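To make the audited quantity concrete, the sketch below computes a simple feature importance disparity for one feature and one fixed subgroup: the absolute gap between the feature's mean attribution magnitude on the subgroup and on the dataset as a whole. This is an illustrative assumption about the formalization, not necessarily the paper's exact FID definition, and it uses SHAP as the local explanation method on a synthetic dataset with a hand-picked subgroup mask; the paper instead searches for large-FID subgroups over a rich class of functions of the protected attributes via oracle-efficient algorithms, which is not reproduced here.

```python
# Minimal sketch (not the paper's algorithm): feature importance disparity for a
# single feature and a fixed subgroup, using SHAP attributions as the importance measure.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

def fid_for_feature(model, X, subgroup_mask, feature_idx):
    """Absolute gap between the feature's mean importance on the subgroup and on all of X."""
    explainer = shap.Explainer(model, X)        # local attribution method (SHAP here)
    attributions = explainer(X).values          # per-example attributions, shape (n_samples, n_features)
    imp = np.abs(attributions[:, feature_idx])  # importance magnitude of the chosen feature
    return abs(imp[subgroup_mask].mean() - imp.mean())

# Illustrative usage on synthetic data; the subgroup here is hand-picked, whereas the
# paper searches over rich subgroups defined by functions of the protected attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=500) > 0).astype(int)
model = GradientBoostingClassifier().fit(X, y)
mask = X[:, 1] > 0.0                            # hypothetical subgroup: examples with feature 1 above zero
print(fid_for_feature(model, X, mask, feature_idx=3))
```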