Explaining black-box Artificial Intelligence (AI) models is a cornerstone of trustworthy AI and a prerequisite for its use in safety-critical applications, so that AI models can reliably assist humans in critical decisions. However, instead of trying to explain our models post-hoc, we need models that are interpretable by design, built on a reasoning process similar to that of humans and exploiting meaningful high-level concepts such as shapes, textures, or object parts. Learning such concepts is often hindered by the need to specify and annotate them explicitly up front. Instead, prototype-based learning approaches such as ProtoPNet claim to discover visually meaningful prototypes in an unsupervised way. In this work, we propose a set of properties that such prototypes must fulfill to enable human analysis, e.g. as part of a reliable model assessment case, and analyse existing methods in light of these properties. Using a 'Guess who?' game, we find that these prototypes still have a long way to go towards providing definite explanations. We quantitatively validate our findings by conducting a user study, which indicates that many of the learnt prototypes are not considered useful for human understanding. We discuss the missing links in existing methods and present a potential real-world application motivating the need to progress towards truly human-interpretable prototypes.