Shapley values have become one of the most popular feature attribution explanation methods. However, most prior work has focused on post-hoc Shapley explanations, which can be computationally demanding due to their exponential time complexity and which preclude model regularization based on Shapley explanations during training. Thus, we propose to incorporate Shapley values themselves as latent representations in deep models, thereby making Shapley explanations first-class citizens in the modeling paradigm. This intrinsic explanation approach enables layer-wise explanations, explanation regularization of the model during training, and fast explanation computation at test time. We define the Shapley transform, which transforms the input into a Shapley representation given a specific function. We operationalize the Shapley transform as a neural network module and construct both shallow and deep networks, called ShapNets, by composing Shapley modules. We prove that our Shallow ShapNets compute exact Shapley values and that our Deep ShapNets maintain the missingness and accuracy properties of Shapley values. We demonstrate on synthetic and real-world datasets that our ShapNets enable layer-wise Shapley explanations, novel Shapley regularizations during training, and fast computation while maintaining reasonable performance. Code is available at https://github.com/inouye-lab/ShapleyExplanationNetworks.
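To make the "exact Shapley values" claim concrete, the sketch below computes Shapley values of a function by brute-force enumeration over all feature coalitions, which is exactly why exact computation is exponential in the number of features and why a Shapley module can only afford it over a small input window. This is a minimal illustration, not the authors' implementation; the function name `exact_shapley` and the use of a baseline vector to model feature absence are assumptions made here for clarity.

```python
import itertools
import math
from typing import Callable, Sequence


def exact_shapley(f: Callable[[Sequence[float]], float],
                  x: Sequence[float],
                  baseline: Sequence[float]) -> list:
    """Exact Shapley values of f at x via enumeration of all coalitions.

    Features outside a coalition are replaced by `baseline` values (a
    common, but here assumed, way to model feature absence). The cost is
    O(2^d), so this is only feasible for small d, such as the small
    feature windows a Shapley module operates on.
    """
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for r in range(d):
            # Standard Shapley weight |S|! (d - |S| - 1)! / d! for |S| = r.
            weight = (math.factorial(r) * math.factorial(d - r - 1)
                      / math.factorial(d))
            for S in itertools.combinations(others, r):
                with_i = [x[j] if (j in S or j == i) else baseline[j]
                          for j in range(d)]
                without_i = [x[j] if j in S else baseline[j]
                             for j in range(d)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


# Toy check: for an additive function with a zero baseline, each feature's
# Shapley value equals its own contribution.
f = lambda z: 2.0 * z[0] + 3.0 * z[1]
print(exact_shapley(f, x=[1.0, 1.0], baseline=[0.0, 0.0]))  # [2.0, 3.0]
```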