Saliency methods attempt to explain deep neural networks by highlighting the most salient features of a sample. Some widely used methods are based on a theoretical framework called Deep Taylor Decomposition (DTD), which formalizes the recursive application of the Taylor theorem to the network's layers. However, recent work has found that these methods are independent of the network's deeper layers and appear to respond only to lower-level image structure. Here, we investigate the DTD theory to better understand this perplexing behavior and find that the Deep Taylor Decomposition is equivalent to the basic gradient$\times$input method when the Taylor root points (an important parameter of the algorithm, chosen by the user) are locally constant. If the root points are locally input-dependent, then one can justify any explanation; in this case, the theory is under-constrained. In an empirical evaluation, we find that DTD root points do not lie in the same linear regions as the input, contrary to a fundamental assumption of the Taylor theorem. The theoretical foundations of DTD have been cited as a source of reliability for the resulting explanations; however, our findings urge caution in making such claims.
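To make the stated equivalence concrete, consider a minimal one-step sketch of first-order Taylor attribution (the notation $f$, $\tilde{x}$, and $R_i$ is ours, introduced only for illustration, and elides the layer-wise recursion of the full method). For a scalar network output $f$ and a root point $\tilde{x}$, the Taylor expansion attributes to input feature $i$ the relevance
\[
R_i(x) = \frac{\partial f}{\partial x_i}(\tilde{x})\,\bigl(x_i - \tilde{x}_i\bigr).
\]
If $\tilde{x}$ is locally constant in $x$ and lies in the same linear region of a piecewise-linear (e.g., ReLU) network, so that $\nabla f(\tilde{x}) = \nabla f(x)$, then the choice $\tilde{x} = 0$ with $f(0) = 0$ collapses this to
\[
R_i(x) = \frac{\partial f}{\partial x_i}(x)\,x_i,
\]
which is exactly gradient$\times$input. The same-linear-region requirement is the assumption our empirical evaluation finds violated in practice.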