A plethora of methods have been proposed to explain how deep neural networks reach a decision, but comparatively little effort has been made to ensure that the explanations produced by these methods are objectively relevant. While desirable properties for a good explanation are easy to come by, objective measures have been harder to derive. Here, we propose two new measures to evaluate explanations, borrowed from the field of algorithmic stability: relative consistency (ReCo) and mean generalizability (MeGe). We conduct several experiments on multiple image datasets and network architectures to demonstrate the benefits of the proposed measures over representative methods. We show that popular fidelity measures are not sufficient to guarantee good explanations. Finally, we show empirically that 1-Lipschitz networks provide general and consistent explanations, regardless of the explanation method used, making them a relevant direction for explainability.