Explainable AI (XAI) has attracted much research attention in recent years, with feature attribution algorithms, which compute "feature importance" in predictions, becoming increasingly popular. However, there is little analysis of the validity of these algorithms, because existing datasets contain no "ground truth" against which their correctness can be validated. In this work, we develop a method to quantitatively evaluate the correctness of XAI algorithms by creating datasets with known explanation ground truth. To this end, we focus on binary classification problems. String datasets are constructed using a formal language derived from a grammar. A string is positive if and only if a certain property is fulfilled. A symbol in a positive string is part of the explanation ground truth if and only if it contributes to fulfilling the property. Two popular feature attribution explainers, Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), are used in our experiments. We show that: (1) classification accuracy is positively correlated with explanation accuracy; (2) SHAP provides more accurate explanations than LIME; and (3) explanation accuracy is negatively correlated with dataset complexity.
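To make the dataset construction concrete, the following is a minimal, hypothetical sketch in Python. It is not the paper's actual grammar: it uses a toy property (the string contains the substring "ab") in place of a grammar-derived language, labels a string positive if and only if the property holds, and records the positions of the contributing symbols as the explanation ground truth.

```python
import random

# Toy alphabet; the real datasets use symbols generated from a formal grammar.
ALPHABET = ["a", "b", "c"]

def make_string(length, rng):
    """Generate a random string over the toy alphabet."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def label_and_ground_truth(s):
    """Return (label, ground_truth_positions) for string s.

    A string is positive iff it contains "ab"; the symbols forming that
    occurrence are the ground-truth explanation.
    """
    idx = s.find("ab")
    if idx == -1:
        return 0, []
    return 1, [idx, idx + 1]  # positions of the 'a' and 'b' fulfilling the property

def make_dataset(n, length=10, seed=0):
    """Build n labelled strings with explanation ground truth."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n):
        s = make_string(length, rng)
        label, gt = label_and_ground_truth(s)
        dataset.append((s, label, gt))
    return dataset

if __name__ == "__main__":
    for s, y, gt in make_dataset(5):
        print(s, y, gt)
```

With such a dataset, an attribution produced by LIME or SHAP for a positive string can be scored against the recorded ground-truth positions, which is the kind of quantitative comparison the evaluation method relies on.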