Shapley values have recently become a popular way to explain the predictions of both complex and simple machine learning models. This paper discusses the factors that influence the Shapley value. In particular, we explore the relationship between the distribution of a feature and its Shapley value. We extend the analysis by discussing the differences that arise in Shapley explanations for different predicted outcomes from the same model. Our assessment is that the Shapley value of a particular feature depends not only on its mean but also on other moments, such as the variance. Moreover, the explanations disagree on the baseline prediction, on the signs of the attributions, and on the most important feature across different outcomes (probability, log-odds, and binary decision) generated by the same linear probability model (logit/probit). These disagreements are not confined to local explainability; they also affect global feature importance. We conclude that there is no unique Shapley explanation for a given model: it varies with the model outcome (probability, log-odds, or a binary decision such as accept vs. reject) and hence with the model's application.
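The dependence on the chosen outcome can be checked by hand on a toy model. The following is a minimal sketch, not the paper's experimental setup: it computes exact Shapley values by averaging over a small background sample, for a hypothetical two-feature logit model. The coefficients (`BIAS`, `WEIGHTS`), the `background` rows, the explained point `x`, and the 0.5 decision threshold are all illustrative assumptions.

```python
import math
from itertools import combinations


def shapley_values(f, x, background):
    """Exact Shapley values of f at x.

    The value of a coalition S is the average of f over the background rows,
    with the features in S fixed to their values in x.
    Returns (baseline, [phi_0, phi_1, ...]).
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for row in background:
            z = [x[i] if i in subset else row[i] for i in range(n)]
            total += f(z)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for s in combinations(others, k):
                s = set(s)
                contrib += weight * (value(s | {i}) - value(s))
        phi.append(contrib)
    return value(set()), phi


# Hypothetical logit model (coefficients are assumptions for illustration).
BIAS, WEIGHTS = -0.5, (1.0, -1.0)


def log_odds(z):
    return BIAS + sum(w * v for w, v in zip(WEIGHTS, z))


def probability(z):
    return 1.0 / (1.0 + math.exp(-log_odds(z)))


def decision(z):
    # Binary outcome, e.g. accept (1) vs. reject (0), at an assumed 0.5 threshold.
    return 1.0 if probability(z) >= 0.5 else 0.0


# Illustrative background sample and point to explain.
background = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (2.0, 1.0)]
x = (2.0, 0.5)

outcomes = [("log-odds", log_odds), ("probability", probability), ("decision", decision)]
results = {name: shapley_values(f, x, background) for name, f in outcomes}

for name, (base, phi) in results.items():
    print(f"{name:12s} baseline={base:+.4f} phi={[round(p, 4) for p in phi]}")
```

For each outcome, the baseline plus the attributions adds up to that outcome's prediction at `x`, yet the baselines are not related by the link function: the mean of the predicted probabilities is generally not the sigmoid of the mean log-odds. Changing the spread of a feature in `background` while holding its mean fixed leaves the log-odds attributions unchanged but can change the probability and decision attributions.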