The Shapley value is widely regarded as a trustworthy attribution metric. However, when Shapley values are used to explain the attributions of the input variables of a deep neural network (DNN), approximating them with reasonable accuracy usually incurs a very high computational cost in real-world applications. We therefore propose a novel network architecture, the HarsanyiNet, which makes inferences on an input sample and simultaneously computes the exact Shapley values of its input variables in a single forward propagation. The design of the HarsanyiNet rests on a theoretical result: the Shapley value can be reformulated as a redistribution of the Harsanyi interactions encoded by the network.
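The reformulation mentioned above can be checked numerically on a small example. The sketch below is illustrative only and is not the HarsanyiNet architecture: it enumerates all subsets by brute force, which is exponential in the number of variables. It assumes the standard definitions, namely the Harsanyi interaction I(S) = Σ_{T⊆S} (−1)^{|S|−|T|} v(T) and the redistribution φ(i) = Σ_{S∋i} I(S)/|S|. The function names and the toy value function `toy_value` are hypothetical and chosen for this illustration.

```python
from itertools import combinations
from math import factorial


def subsets(players):
    """Yield every subset of `players` as a frozenset."""
    players = list(players)
    for r in range(len(players) + 1):
        for combo in combinations(players, r):
            yield frozenset(combo)


def harsanyi_interactions(players, v):
    """Harsanyi interaction I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T)."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
        for S in subsets(players)
    }


def shapley_from_harsanyi(players, v):
    """Split each interaction I(S) evenly among the |S| variables it involves."""
    interactions = harsanyi_interactions(players, v)
    return {
        i: sum(val / len(S) for S, val in interactions.items() if i in S)
        for i in players
    }


def shapley_classic(players, v):
    """Classic definition: weighted average of marginal contributions."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for S in subsets(others):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi


def toy_value(S):
    """Hypothetical toy game: additive terms plus two interaction terms."""
    base = {0: 1.0, 1: 2.0, 2: -0.5}
    total = sum(base[i] for i in S)
    if 0 in S and 1 in S:
        total += 3.0   # pairwise interaction between variables 0 and 1
    if {0, 1, 2} <= S:
        total -= 1.5   # three-way interaction among all variables
    return total


if __name__ == "__main__":
    players = [0, 1, 2]
    print(shapley_from_harsanyi(players, toy_value))
    print(shapley_classic(players, toy_value))
```

In this toy game both routes give the same attributions, and the attributions sum to the value of the full set of variables, as the efficiency property of the Shapley value requires. The point of the HarsanyiNet, by contrast, is to obtain these quantities from a single forward pass rather than from subset enumeration.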