Understanding and explaining the decisions of neural networks is critical to building trust, rather than relying on them as black-box algorithms. Post-hoc evaluation techniques, such as Grad-CAM, enable humans to inspect the spatial regions responsible for a particular network decision. However, it has been shown that such explanations are not always consistent with human priors, such as consistency across image transformations. Given an interpretation algorithm, e.g., Grad-CAM, we introduce a novel training method that trains the model to produce more consistent explanations. Since obtaining ground truth for a desired model interpretation is not a well-defined task, we adopt ideas from contrastive self-supervised learning and apply them to the interpretations of the model rather than its embeddings. Explicitly training the network to produce more reasonable interpretations, and subsequently evaluating those interpretations, will enhance our ability to trust the network. We show that our method, Contrastive Grad-CAM Consistency (CGC), results in Grad-CAM interpretation heatmaps that are consistent with human annotations while still achieving comparable classification accuracy. Moreover, since our method can be seen as a form of regularization, it outperforms the baseline classification accuracy in limited-data fine-grained classification settings on the Caltech-Birds, Stanford Cars, VGG Flowers, and FGVC-Aircraft datasets. In addition, because our method does not rely on annotations, it can incorporate unlabeled data into training, which enables better generalization of the model. Our code is publicly available.
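To make the core idea concrete, the following is a minimal PyTorch sketch, not the authors' released implementation, of a contrastive consistency loss applied to Grad-CAM heatmaps: an image and its flipped view should yield heatmaps that match under the same flip, more than they match other images' heatmaps in the batch. All names (`SmallCNN`, `gradcam`, `cgc_style_loss`) and the specific architecture, transformation, and temperature are illustrative assumptions.

```python
# Illustrative sketch only: contrastive consistency of Grad-CAM heatmaps
# across a known image transformation (horizontal flip), under assumptions
# stated in the lead-in. Not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """A toy classifier exposing its last conv feature maps for Grad-CAM."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):
        feats = self.backbone(x)                      # (B, C, H, W)
        logits = self.head(feats.mean(dim=(2, 3)))    # global average pool
        return logits, feats

def gradcam(logits, feats, target):
    """Differentiable Grad-CAM heatmap for the target class of each sample."""
    score = logits.gather(1, target[:, None]).sum()
    # create_graph=True keeps the heatmap differentiable so the
    # consistency loss can be backpropagated through it.
    grads = torch.autograd.grad(score, feats, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)    # channel importance
    cam = F.relu((weights * feats).sum(dim=1))        # (B, H, W)
    return cam / (cam.flatten(1).sum(dim=1)[:, None, None] + 1e-8)

def cgc_style_loss(cam_anchor, cam_transformed, temperature=0.1):
    """InfoNCE-style loss: each transformed heatmap should be most similar
    to the (identically transformed) heatmap of its own image in the batch."""
    a = F.normalize(cam_anchor.flatten(1), dim=1)
    b = F.normalize(cam_transformed.flatten(1), dim=1)
    logits = a @ b.t() / temperature                  # (B, B) similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

# One training step: classification loss plus heatmap-consistency loss.
model = SmallCNN()
x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))

logits, feats = model(x)
logits_f, feats_f = model(torch.flip(x, dims=[3]))    # flipped view

cam = gradcam(logits, feats, y)
cam_f = gradcam(logits_f, feats_f, y)
# Flipping the input should flip the heatmap: compare flip(cam) to cam_f.
loss = F.cross_entropy(logits, y) + cgc_style_loss(torch.flip(cam, dims=[2]), cam_f)
loss.backward()
```

Note that horizontal flip is chosen here only because its effect on the heatmap is known exactly; the same construction applies to any transformation whose spatial action can be replayed on the heatmap, and the unlabeled-data variant can use the model's own predicted class in place of `y`.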