The Shapley value is a popular approach for measuring the influence of individual features. While Shapley feature attribution is built upon desiderata from game theory, some of its constraints may be less natural in certain machine learning settings, leading to unintuitive model interpretations. In particular, the Shapley value uses the same weight for all marginal contributions; that is, a feature's contribution is given the same importance whether a large or a small number of other features is already present. This property can be problematic when larger feature sets are systematically more or less informative than smaller ones. Our work performs a rigorous analysis of the potential limitations of Shapley feature attribution. We identify simple settings where the Shapley value is mathematically suboptimal, assigning larger attributions to less influential features. Motivated by this observation, we propose WeightedSHAP, which generalizes the Shapley value and learns which marginal contributions to focus on directly from data. On several real-world datasets, we demonstrate that the influential features identified by WeightedSHAP better recapitulate the model's predictions than the features identified by the Shapley value.
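To make the uniform-weighting point concrete, the following is a minimal Python sketch, not the paper's implementation: the toy value function `v` and the example weight vectors are hypothetical. It expresses a feature's attribution as a weighted average of its mean marginal contribution at each coalition size; uniform weights recover the Shapley value, while WeightedSHAP would instead learn these weights from data.

```python
import itertools
import numpy as np

def marginal_contributions(value_fn, n_features, i):
    """Mean marginal contribution of feature i at each coalition size.

    Returns a length-n array whose k-th entry is the average of
    value_fn(S ∪ {i}) - value_fn(S) over coalitions S of size k
    that exclude i.
    """
    others = [j for j in range(n_features) if j != i]
    means = np.zeros(n_features)
    for k in range(n_features):
        deltas = [
            value_fn(set(S) | {i}) - value_fn(set(S))
            for S in itertools.combinations(others, k)
        ]
        means[k] = np.mean(deltas)
    return means

def weighted_attribution(value_fn, n_features, i, weights):
    """Attribution as a weighted average over coalition sizes.

    Uniform weights give the Shapley value; other weight vectors
    emphasize small or large coalitions instead.
    """
    means = marginal_contributions(value_fn, n_features, i)
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(weights / weights.sum(), means))

# Hypothetical toy game: v(S) = |S|^2, so marginal contributions
# grow with coalition size.
v = lambda S: len(S) ** 2
n = 4
shapley = weighted_attribution(v, n, 0, np.ones(n))            # uniform weights
late = weighted_attribution(v, n, 0, np.arange(1, n + 1))      # favor large coalitions
print(shapley, late)
```

In this toy game the marginal contributions at sizes 0..3 are 1, 3, 5, 7, so the uniform average gives the Shapley value of 4 per feature, while shifting weight toward larger coalitions raises the attribution to 5, illustrating how the choice of weights changes the ranking signal that WeightedSHAP tunes.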