Saliency methods interpret the prediction of a neural network by showing the importance of input elements for that prediction. A popular family of saliency methods utilizes gradient information. In this work, we empirically show that two approaches for handling gradient information, namely positive aggregation and positive propagation, break these methods. Although the resulting maps still reflect visually salient information in the input, they no longer explain the model prediction: the generated saliency maps are insensitive to the predicted output and to randomization of model parameters. Specifically, for methods that aggregate the gradients of a chosen layer, such as GradCAM++ and FullGrad, exclusively aggregating positive gradients is detrimental. We further support this finding by proposing several variants of aggregation methods with positive handling of gradient information. For methods that backpropagate gradient information, such as LRP, RectGrad, and Guided Backpropagation, we show the destructive effect of exclusively propagating positive gradient information.
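For concreteness, the aggregation case can be illustrated with a minimal PyTorch sketch. It uses a plain GradCAM-style aggregation rather than GradCAM++ or FullGrad themselves, and the model, target layer, class index, and input are illustrative placeholders, not the paper's experimental setup. The `positive_only` flag applies the positive-aggregation step (clamping gradients to be non-negative before pooling) that the abstract argues is detrimental.

```python
import torch
import torch.nn.functional as F
from torchvision import models


def gradcam_saliency(model, layer, x, class_idx, positive_only=False):
    """GradCAM-style map: weight a layer's activations by its pooled gradients.

    positive_only=True keeps only the positive part of the gradients before
    pooling, mimicking the "positive aggregation" variant under study.
    """
    acts = {}
    handle = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    logits = model(x)
    handle.remove()
    a = acts["a"]
    # Gradient of the target class score w.r.t. the chosen layer's activations.
    (g,) = torch.autograd.grad(logits[0, class_idx], a)
    if positive_only:
        g = g.clamp(min=0)  # discard negative gradient evidence
    weights = g.mean(dim=(2, 3), keepdim=True)  # global-average-pool the gradients
    cam = F.relu((weights * a.detach()).sum(dim=1, keepdim=True))
    return F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)


# Placeholder model and dummy input, purely for illustration.
model = models.resnet18(weights=None).eval()
x = torch.randn(1, 3, 224, 224)
cam_signed = gradcam_saliency(model, model.layer4, x, class_idx=0)
cam_positive = gradcam_saliency(model, model.layer4, x, class_idx=0, positive_only=True)
```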
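The propagation case can likewise be sketched with a Guided-Backpropagation-style backward pass, in which every ReLU passes back only non-negative gradients, i.e., exclusively positive gradient information reaches the input. Again, the torchvision model and class index are placeholders and this is only a sketch of the mechanism, not the paper's implementation.

```python
import torch
from torchvision import models


def guided_backprop(model, x, class_idx):
    """Guided Backpropagation sketch: at every ReLU, clamp the backward
    gradient to be non-negative, so only positive gradient information
    is propagated back to the input.
    """
    handles = []
    for m in model.modules():
        if isinstance(m, torch.nn.ReLU):
            m.inplace = False  # full backward hooks do not support in-place ops
            handles.append(m.register_full_backward_hook(
                lambda mod, gin, gout: (gin[0].clamp(min=0),)))
    x = x.clone().requires_grad_(True)
    logits = model(x)
    model.zero_grad()
    logits[0, class_idx].backward()
    for h in handles:
        h.remove()
    return x.grad.detach()


# Placeholder model and dummy input, purely for illustration.
model = models.vgg16(weights=None).eval()
saliency = guided_backprop(model, torch.randn(1, 3, 224, 224), class_idx=0)
```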