Feature attributions based on the Shapley value are popular for explaining machine learning models; however, their estimation is complex from both a theoretical and computational standpoint. We disentangle this complexity into two factors: (1)~the approach to removing feature information, and (2)~the tractable estimation strategy. These two factors provide a natural lens through which we can better understand and compare 24 distinct algorithms. Based on the various feature removal approaches, we describe the multiple types of Shapley value feature attributions and methods to calculate each one. Then, based on the tractable estimation strategies, we characterize two distinct families of approaches: model-agnostic and model-specific approximations. For the model-agnostic approximations, we benchmark a wide class of estimation approaches and tie them to alternative yet equivalent characterizations of the Shapley value. For the model-specific approximations, we clarify the assumptions crucial to each method's tractability for linear, tree, and deep models. Finally, we identify gaps in the literature and promising future research directions.
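To make the object of study concrete, the following is a minimal illustrative sketch (not any specific algorithm from the paper) of the exact Shapley value computed by enumerating all coalitions. The toy value function assumes a linear model and one particular feature-removal choice, replacing removed features with a baseline of zero; both the model and the baseline are hypothetical choices for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values via full coalition enumeration (exponential cost).

    `value` maps a frozenset of players (features) to a real number.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                # Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                # Marginal contribution of feature i to coalition S
                total += weight * (value(S | {i}) - value(S))
        phi[i] = total
    return phi

# Hypothetical value function: linear model f(x) = 2*x1 + 3*x2 evaluated at
# x = (1, 1), with "removed" features set to a baseline of 0 -- one possible
# feature-removal approach among those the paper categorizes.
def value(S):
    x = {"x1": 1.0, "x2": 1.0}
    coeffs = {"x1": 2.0, "x2": 3.0}
    return sum(coeffs[j] * x[j] for j in S)

phi = shapley_values(["x1", "x2"], value)
# For a linear model with a zero baseline, each attribution reduces to
# coefficient * feature value, i.e. phi["x1"] = 2.0 and phi["x2"] = 3.0.
```

The enumeration above costs O(2^n) evaluations of the value function, which is exactly why the tractable estimation strategies surveyed in the abstract (model-agnostic and model-specific approximations) are needed.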