Attribution methods, which employ heatmaps to identify the most influential regions of an image that impact model decisions, have gained widespread popularity as a type of explainability method. However, recent research has exposed the limited practical value of these methods, attributed in part to their narrow focus on the most prominent regions of an image -- revealing "where" the model looks, but failing to elucidate "what" the model sees in those areas. In this work, we fill this gap with CRAFT -- a novel approach that identifies both "what" and "where" by generating concept-based explanations. We introduce three new ingredients to the automatic concept extraction literature: (i) a recursive strategy to detect and decompose concepts across layers, (ii) a novel method for a more faithful estimation of concept importance using Sobol indices, and (iii) the use of implicit differentiation to unlock Concept Attribution Maps. We conduct both human and computer vision experiments to demonstrate the benefits of the proposed approach. We show that the proposed concept importance estimation technique is more faithful to the model than previous methods. When evaluating the usefulness of the method on a human-centered utility benchmark, we find that our approach significantly improves on two of the three test scenarios. Our code is freely available at github.com/deel-ai/Craft.
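To make the concept-importance ingredient concrete, below is a minimal sketch of how total-order Sobol indices can be estimated for a set of concept coefficients using the standard Jansen pick-freeze estimator. This is an illustrative assumption of the general technique, not the actual CRAFT implementation (see github.com/deel-ai/Craft for that); the names `total_sobol_indices` and `g` are hypothetical, where `g` stands for any function mapping masked concept coefficients to a class logit.

```python
import numpy as np

def total_sobol_indices(g, k, n=2048, seed=0):
    """Estimate total-order Sobol indices of g over k concept masks.

    g : callable taking an (m, k) array of mask values in [0, 1]
        (one column per concept) and returning an (m,) array of logits.
    Returns an array of k total indices (higher = more important concept).
    """
    rng = np.random.default_rng(seed)
    A = rng.random((n, k))           # first independent sample of masks
    B = rng.random((n, k))           # second independent sample of masks
    fA = g(A)
    var = fA.var()
    st = np.zeros(k)
    for i in range(k):
        AB = A.copy()
        AB[:, i] = B[:, i]           # re-sample only the i-th concept
        fAB = g(AB)
        # Jansen estimator of the total-order index for concept i
        st[i] = 0.5 * np.mean((fA - fAB) ** 2) / var
    return st
```

In a concept-based pipeline, `g` would typically multiply the concept coefficients of an image (e.g., obtained by a non-negative factorization of intermediate activations) by the sampled masks and forward the result through the remaining layers of the model; the resulting indices then rank concepts by how much of the output variance they explain, alone or through interactions.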