The recent success of deep learning models in solving complex problems across different domains has increased interest in understanding what they learn. Consequently, different approaches have been employed to explain these models, one of which uses human-understandable concepts as explanations. Two examples of methods that follow this approach are Network Dissection and Compositional Explanations. The former explains units using atomic concepts, while the latter makes explanations more expressive by replacing atomic concepts with logical forms. While, intuitively, logical forms are more informative than atomic concepts, it is not clear how to quantify this improvement, and their evaluation often relies on the same metric that is optimized during the search process and on hyper-parameters that must be tuned. In this paper, we propose Detection Accuracy as an evaluation metric, which measures how consistently units detect their assigned explanations. We show that this metric (1) evaluates explanations of different lengths effectively, (2) can be used as a stopping criterion for the compositional explanation search, eliminating the explanation-length hyper-parameter, and (3) exposes new specialized units whose length-1 explanations are the perceptual abstractions of their longer explanations.
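To make the idea concrete, here is a minimal sketch, not the paper's actual implementation, that assumes Detection Accuracy can be computed from binarized unit-activation and explanation masks as the fraction of explanation occurrences the unit detects, and that the compositional search returns one best candidate mask per explanation length; the names `detection_accuracy`, `best_length_by_detection_accuracy`, and `explanations_by_length` are illustrative, not taken from the paper.

```python
import numpy as np

def detection_accuracy(unit_mask, explanation_mask):
    """Fraction of locations where the explanation holds that the unit also
    detects (fires above its activation threshold). Both arguments are boolean
    arrays of the same shape, one entry per sample/pixel. Assumed definition,
    used here only for illustration."""
    present = explanation_mask.sum()
    return 0.0 if present == 0 else float((unit_mask & explanation_mask).sum() / present)

def best_length_by_detection_accuracy(unit_mask, explanations_by_length):
    """Stopping-criterion sketch: explanations_by_length[k] is the mask of the
    best logical form of length k+1 found by the compositional search. Stop
    growing the formula as soon as Detection Accuracy stops improving, so the
    explanation length is chosen by the metric rather than tuned as a
    hyper-parameter."""
    best_len, best_score = 0, -1.0
    for length, mask in enumerate(explanations_by_length, start=1):
        score = detection_accuracy(unit_mask, mask)
        if score <= best_score:
            break
        best_len, best_score = length, score
    return best_len, best_score

# Toy usage with random masks, purely illustrative.
rng = np.random.default_rng(0)
unit = rng.random(1000) > 0.7
candidates = [rng.random(1000) > 0.7 for _ in range(5)]
print(best_length_by_detection_accuracy(unit, candidates))
```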