We present a simple yet highly generalizable method for explaining interacting parts within a neural network's reasoning process. First, we design an algorithm based on cross derivatives for computing statistical interaction effects between individual features, which generalizes to both 2-way and higher-order (3-way or more) interactions. We present results side by side with a weight-based attribution technique, showing that cross derivatives are the better metric for both 2-way and higher-order interaction detection. Moreover, we extend the use of cross derivatives as an explanatory device in neural networks to the computer vision setting by extending Grad-CAM, a popular gradient-based explanatory tool for CNNs, to higher orders. While Grad-CAM can only explain the importance of individual objects in images, our method, which we call Taylor-CAM, can explain a neural network's relational reasoning across multiple objects. We demonstrate the success of our explanations both qualitatively and quantitatively, including with a user study. We will release all code as a tool package to facilitate explainable deep learning.
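As a rough illustration of the cross-derivative idea for 2-way interactions, the sketch below scores each feature pair by the mean absolute mixed partial derivative of a model's output over a few sample points. It is not the authors' implementation: `toy_model`, `cross_derivative`, `pairwise_interactions`, the sample points, and the step size `h` are all hypothetical, and the derivative is approximated with central finite differences on a hand-written function rather than computed on a trained network.

```python
from itertools import combinations


def toy_model(x):
    # Hypothetical stand-in for a trained network:
    # x[0] and x[1] interact multiplicatively, x[2] enters additively.
    return 3.0 * x[0] * x[1] + 2.0 * x[2]


def cross_derivative(f, x, i, j, h=1e-3):
    """Central finite-difference estimate of d^2 f / (dx_i dx_j) at x, for i != j."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return f(y)

    return (shifted(h, h) - shifted(h, -h)
            - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)


def pairwise_interactions(f, points, n_features):
    """Mean absolute cross derivative for every feature pair over the sample points."""
    scores = {}
    for i, j in combinations(range(n_features), 2):
        values = [abs(cross_derivative(f, p, i, j)) for p in points]
        scores[(i, j)] = sum(values) / len(values)
    return scores


# Assumed sample points; in practice these would come from the data.
points = [[0.5, -1.0, 2.0], [1.5, 0.3, -0.7], [-2.0, 1.0, 0.0]]
scores = pairwise_interactions(toy_model, points, n_features=3)
```

On this toy function only the pair `(0, 1)` receives a score clearly above zero, since a purely additive feature has a vanishing cross derivative with every other feature. A 3-way score would follow the same pattern with a third-order mixed derivative.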