EXplainable Artificial Intelligence (XAI) is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the subject, yet XAI still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that synthesizes what can be found in the literature. We recognize that explanations are not atomic but rather the combination of evidence stemming from the model and its input-output mapping, and the human interpretation of this evidence. Furthermore, we characterize explanations through the properties of faithfulness (i.e., the explanation being a true description of the model's inner workings and decision-making process) and plausibility (i.e., how convincing the explanation appears to the user). Our proposed theoretical framework simplifies how these properties are operationalized and provides new insight into common explanation methods, which we analyze as case studies.