Prior gradient-based attribution-map methods rely on handcrafted propagation rules for the non-linear (activation) layers during the backward pass to produce gradients with respect to the input, from which the attribution map is derived. Despite their promising results, such methods are sensitive to non-informative high-frequency components and adapt poorly to different models and samples. In this paper, we propose a method for generating attribution maps that learns the propagation rules automatically, overcoming the flaws of the handcrafted ones. Specifically, we introduce a learnable plugin module into the non-linear layers during the backward pass; it enables adaptive propagation rules for each pixel and is used to generate a mask. The masked input image is then fed into the model again to obtain a new output that, combined with the original output, serves as guidance. The learnable module can be trained under any auto-grad framework that supports higher-order differentiation. As demonstrated on five datasets and six network architectures, the proposed method yields state-of-the-art results and gives cleaner and more visually plausible attribution maps.
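To make the idea of a propagation rule concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a toy one-hidden-layer ReLU network with a scalar output, and it contrasts two well-known handcrafted rules (plain gradient and guided backpropagation) with a parameterized rule. The function `make_learnable_rule`, its sigmoid gate, and the parameters `a`, `b`, `c` are hypothetical stand-ins for the learnable plugin module; the abstract does not specify the module's form. Training the parameters, which would require higher-order differentiation, is omitted.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * t for w, t in zip(row, x)) for row in W]

def forward(W1, w2, x):
    """Toy model: y = w2 . relu(W1 x). Returns pre-activations z and output y."""
    z = matvec(W1, x)
    h = [max(0.0, t) for t in z]
    y = sum(a * b for a, b in zip(w2, h))
    return z, y

# Handcrafted rules: how a gradient g passes back through a ReLU whose
# pre-activation was z.
def vanilla_rule(z, g):
    return g if z > 0 else 0.0              # ordinary ReLU derivative

def guided_rule(z, g):
    return g if (z > 0 and g > 0) else 0.0  # guided backpropagation

def make_learnable_rule(a, b, c):
    """Hypothetical parameterized rule: a soft gate on the gradient.
    The gate form and the parameters a, b, c are illustrative assumptions."""
    def rule(z, g):
        gate = 1.0 / (1.0 + math.exp(-(a * z + b * g + c)))
        return g * gate
    return rule

def input_gradient(W1, w2, x, rule):
    """Backward pass to the input, using `rule` at the ReLU layer."""
    z, _ = forward(W1, w2, x)
    # The gradient of y with respect to the hidden activations is w2.
    g_hidden = [rule(zj, gj) for zj, gj in zip(z, w2)]
    # Propagate through the linear layer: W1^T g_hidden.
    return [sum(W1[j][i] * g_hidden[j] for j in range(len(W1)))
            for i in range(len(x))]

def attribution_mask(grad):
    """Scale gradient magnitudes to [0, 1] so they can act as a mask."""
    mag = [abs(g) for g in grad]
    top = max(mag)
    return [m / top if top > 0 else 0.0 for m in mag]

def masked_forward(W1, w2, x, rule):
    """Build a mask from the attribution, mask the input, run the model again."""
    mask = attribution_mask(input_gradient(W1, w2, x, rule))
    x_masked = [m * t for m, t in zip(mask, x)]
    _, y_masked = forward(W1, w2, x_masked)
    return mask, y_masked

if __name__ == "__main__":
    W1 = [[1.0, -2.0, 0.5, 0.0], [0.0, 1.0, -1.0, 2.0], [-1.0, 0.5, 0.0, 1.0]]
    w2 = [1.0, -1.0, 0.5]
    x = [1.0, 0.5, -1.0, 2.0]
    for name, rule in [("vanilla", vanilla_rule),
                       ("guided", guided_rule),
                       ("learnable", make_learnable_rule(2.0, 1.0, 0.0))]:
        print(name, input_gradient(W1, w2, x, rule), masked_forward(W1, w2, x, rule))
```

In this sketch the rule is applied per hidden unit rather than per pixel, and the masked output is only computed, not used in a loss. In an auto-grad framework, the parameters of the rule would instead be optimized against an objective that combines the original and masked outputs, as the abstract describes.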