Despite the tremendous success of convolutional neural networks (CNNs) in computer vision, the inner workings of CNNs still lack a clear interpretation. Class activation mapping (CAM), a well-known visualization technique for interpreting a CNN's decisions, has drawn increasing attention. Gradient-based CAMs are efficient, but their performance is heavily affected by vanishing and exploding gradients. In contrast, gradient-free CAMs avoid computing gradients and produce more understandable results. However, existing gradient-free CAMs are quite time-consuming because they require hundreds of forward inferences per image. In this paper, we propose Cluster-CAM, an effective and efficient gradient-free CNN interpretation algorithm. Cluster-CAM significantly reduces the number of forward passes by splitting the feature maps into clusters in an unsupervised manner. Furthermore, we propose an artful strategy to forge a cognition-base map and cognition-scissors from the clustered feature maps. The final saliency heatmap is computed by merging these cognition maps. Qualitative results show that Cluster-CAM produces heatmaps whose highlighted regions match human cognition more precisely than those of existing CAMs. The quantitative evaluation further demonstrates the superiority of Cluster-CAM in both effectiveness and efficiency.
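
To make the efficiency argument concrete, the following is a minimal sketch of the general idea of clustering feature maps so that one forward pass is needed per cluster rather than per channel. It is not the authors' implementation. The abstract does not say which clustering algorithm is used or how the cognition-base map and cognition-scissors are built, so the plain k-means routine, the score-weighted merge, and the names `kmeans`, `cluster_feature_maps`, `heatmap`, and `model_score` are all assumptions made for illustration. The sketch also assumes the feature maps already have the same spatial size as the image, which skips the upsampling a real CNN would need.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one label per row."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centers does not modify points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Squared distance from every point to every center: shape (n, k).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def cluster_feature_maps(feature_maps, k):
    """Group (C, H, W) feature maps into at most k masks scaled to [0, 1]."""
    c, h, w = feature_maps.shape
    labels = kmeans(feature_maps.reshape(c, h * w), k)
    masks = []
    for j in range(k):
        members = feature_maps[labels == j]
        if len(members) == 0:
            continue
        merged = members.sum(axis=0)
        # Min-max normalize so the mask can be multiplied with the image.
        merged = (merged - merged.min()) / (merged.max() - merged.min() + 1e-8)
        masks.append(merged)
    return np.stack(masks)


def heatmap(image, masks, model_score):
    """Score each masked image with one forward pass, then merge the masks.

    model_score is a hypothetical callable standing in for the CNN's
    target-class score. The score-weighted sum is a stand-in for the
    paper's merging step, which the abstract does not specify.
    """
    scores = np.array([model_score(image * m) for m in masks])
    return np.tensordot(scores, masks, axes=1)
```

With, say, 64 feature maps and `k=4`, `heatmap` calls `model_score` at most 4 times instead of 64, which is the kind of saving the abstract describes.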