Attribution methods are a popular class of explainability methods that use heatmaps to depict the areas of an image that most strongly drive a model's decision. Nevertheless, recent work has shown that these methods have limited utility in practice, presumably because they only highlight the most salient parts of an image (i.e., 'where' the model looked) and do not communicate any information about 'what' the model saw at those locations. In this work, we try to fill this gap with CRAFT, a novel approach that identifies both 'what' and 'where' by generating concept-based explanations. We introduce three new ingredients to the automatic concept extraction literature: (i) a recursive strategy to detect and decompose concepts across layers, (ii) a novel method for a more faithful estimation of concept importance using Sobol indices, and (iii) the use of implicit differentiation to unlock Concept Attribution Maps. We conduct both human and computer vision experiments to demonstrate the benefits of the proposed approach. We show that our recursive decomposition generates meaningful and accurate concepts and that the proposed concept importance estimation technique is more faithful to the model than previous methods. When evaluating the usefulness of the method for human experimenters on a human-defined utility benchmark, we find that our approach yields significant improvements in two of the three test scenarios (while none of the current methods, including ours, helps in the third). Overall, our study suggests that, while much work remains toward the development of general explainability methods that are useful in practical scenarios, the identification of meaningful concepts at the proper level of granularity yields useful and complementary information beyond that afforded by attribution methods.
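To make ingredient (ii) more concrete, the sketch below estimates total-order Sobol indices for a small set of concepts by randomly masking them and measuring how much the output varies. It is a minimal illustration, not the authors' implementation: the Jansen estimator is one common choice and is assumed here, and `toy_model`, `concept_activations`, and `head_weights` are hypothetical stand-ins for the part of a network that maps concept coefficients to a class score.

```python
import numpy as np


def total_sobol_indices(model, n_concepts, n_samples=4096, seed=0):
    """Estimate total-order Sobol indices with the Jansen estimator.

    `model` maps an array of masks with shape (n_samples, n_concepts),
    values in [0, 1], to an array of n_samples scalar outputs.
    """
    rng = np.random.default_rng(seed)
    # Two independent sets of random masks, one column per concept.
    a = rng.random((n_samples, n_concepts))
    b = rng.random((n_samples, n_concepts))
    y_a = model(a)
    # Output variance, estimated from both sample sets.
    variance = np.var(np.concatenate([y_a, model(b)]))
    indices = np.zeros(n_concepts)
    for i in range(n_concepts):
        # Resample only the mask of concept i; keep the others fixed.
        a_bi = a.copy()
        a_bi[:, i] = b[:, i]
        indices[i] = np.mean((y_a - model(a_bi)) ** 2) / (2.0 * variance)
    return indices


# Hypothetical stand-in for a classifier head acting on concept activations.
concept_activations = np.array([1.0, 1.0, 1.0])
head_weights = np.array([3.0, 1.0, 0.0])


def toy_model(masks):
    # Scale each concept activation by its mask, then apply a linear head.
    return (masks * concept_activations) @ head_weights


importance = total_sobol_indices(toy_model, n_concepts=3)
```

For this linear toy model the indices can be checked by hand: each concept's share of the output variance is proportional to its squared weight, so the three values should come out near 0.9, 0.1, and 0. A concept the head ignores receives no importance, which is the behavior one would want from a faithful importance estimate.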