Explaining deep convolutional neural networks has recently drawn increasing attention, since it helps us understand the networks' internal operations and why they make particular decisions. Saliency maps, which highlight the image regions most relevant to the network's decision, are one of the most common ways to visualize and analyze deep networks in the computer vision community. However, saliency maps generated by existing methods cannot faithfully represent the information in images, because the weights they assign to activation maps rest on unproven assumptions, lack a solid theoretical foundation, and fail to account for the relations between pixels. In this paper, we develop a novel post-hoc visual explanation method, called Shap-CAM, based on class activation mapping. Unlike previous gradient-based approaches, Shap-CAM removes the dependence on gradients by obtaining the importance of each pixel through its Shapley value. We demonstrate that Shap-CAM achieves better visual performance and fairness in interpreting the decision-making process. Our approach outperforms previous methods on both recognition and localization tasks.
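For reference, the classical Shapley value that the method builds on is defined below; how Shap-CAM instantiates the game is not spelled out in this abstract, so the reading that players correspond to pixels (or activation-map cells) and that the payoff $v$ is the model's class score on a subset of them is an assumption on our part.

\[
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,\bigl( v(S \cup \{i\}) - v(S) \bigr),
\]

where $N$ is the set of players, $v(S)$ is the payoff attained by coalition $S$, and $\phi_i(v)$ averages player $i$'s marginal contribution over all coalitions. Because this average ranges over subsets rather than a single gradient evaluation, it captures interactions between pixels that gradient-based weights ignore, which is the property the abstract appeals to.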