The most popular methods for measuring the importance of variables in a black box prediction algorithm make use of synthetic inputs that combine predictor variables from multiple subjects. These inputs can be unlikely, physically impossible, or even logically impossible. As a result, the predictions for such cases can be based on data very unlike any the black box was trained on. We think that users cannot trust an explanation of a prediction algorithm's decision when the explanation relies on such values. Instead, we advocate a method called Cohort Shapley that is grounded in economic game theory and that, unlike most other game-theoretic methods, uses only actually observed data to quantify variable importance. Cohort Shapley works by narrowing the cohort of subjects judged to be similar to a target subject on one or more features. We illustrate it on an algorithmic fairness problem where it is essential to attribute importance to protected variables that the model was not trained on.
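To make the cohort-narrowing idea concrete, here is a minimal illustrative sketch (not the authors' implementation) that computes exact Cohort Shapley values for one target subject by brute-force enumeration over feature subsets. The function and variable names (`cohort_shapley`, `similar`) are our own, and the exact-match similarity rule in the usage example is only one possible choice of how "similar to the target" is defined.

```python
import itertools
import math
import numpy as np

def cohort_shapley(X, y, target_idx, similar):
    """Exact Cohort Shapley for a single target subject (illustrative sketch).

    X: (n, d) array of observed features; y: (n,) array of predictions or
    outcomes; target_idx: row index of the target subject; similar(j, xi, xt):
    True if value xi of feature j counts as similar to the target's value xt.
    Returns a length-d array of attributions that, by efficiency, sum to the
    mean response of the narrowest cohort minus the overall mean of y.
    """
    n, d = X.shape
    target = X[target_idx]

    # Per-feature similarity masks: which subjects match the target on feature j.
    masks = np.array([[similar(j, X[i, j], target[j]) for i in range(n)]
                      for j in range(d)])

    def value(S):
        # Cohort value: mean response over subjects similar to the target on
        # every feature in S; the empty set gives the full-sample mean.
        in_cohort = np.ones(n, dtype=bool)
        for j in S:
            in_cohort &= masks[j]
        return y[in_cohort].mean()

    # Standard Shapley formula, enumerating all subsets of the other features.
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for r in range(len(others) + 1):
            w = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                phi[j] += w * (value(S + (j,)) - value(S))
    return phi

# Hypothetical usage on synthetic-free, fully observed data.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 3))            # three categorical features
y = X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 0.1, 200)
phi = cohort_shapley(X, y, target_idx=0,
                     similar=lambda j, xi, xt: xi == xt)   # exact-match cohorts
```

Note that every value function evaluation averages responses over actually observed subjects only; no synthetic input mixing features from different subjects is ever constructed, which is the point of contrast with interventional Shapley methods drawn in the abstract.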