Due to the absence of ground truth, the objective evaluation of explainability methods is an essential research direction. To date, the vast majority of evaluations fall into three categories: human evaluation, sensitivity testing, and sanity checks. This work proposes a novel evaluation methodology from the perspective of generalizability. We employ an autoencoder to learn the distributions of the generated explanations and observe their learnability as well as the plausibility of the learned distributional features. We first briefly demonstrate the evaluation idea of the proposed approach on LIME, and then quantitatively evaluate multiple popular explainability methods. We also find that smoothing the explanations with SmoothGrad significantly enhances their generalizability.
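To make the evaluation idea concrete, below is a minimal sketch of how an autoencoder could be used to probe the learnability of explanations, assuming each explanation (e.g., a LIME attribution vector or a SmoothGrad saliency map) is flattened into a fixed-length vector. The architecture, training loop, and reconstruction-error metric here are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

# Hypothetical setup: explanations produced by some method are stacked
# into tensors of shape (N, input_dim). The autoencoder tries to learn
# their distribution; reconstruction error on held-out explanations
# serves as a proxy for how learnable (generalizable) they are.
class ExplanationAutoencoder(nn.Module):
    def __init__(self, input_dim: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def generalizability_score(model: nn.Module,
                           train_expl: torch.Tensor,
                           test_expl: torch.Tensor,
                           epochs: int = 50,
                           lr: float = 1e-3) -> float:
    """Fit the autoencoder on one split of explanations and report the
    reconstruction MSE on a held-out split (lower = more learnable)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(train_expl), train_expl)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return loss_fn(model(test_expl), test_expl).item()
```

Under this framing, a lower held-out reconstruction error would indicate that the explanation method produces distributionally regular explanations, which is one plausible way to operationalize the generalizability the abstract refers to.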