The field of explainable AI (XAI) has quickly become a thriving and prolific community. However, a silent, recurrent and acknowledged issue in this area is the lack of consensus regarding its terminology. In particular, each new contribution seems to rely on its own (and often intuitive) version of terms like "explanation" and "interpretation". Such disarray encumbers the consolidation of advances in the field towards the fulfillment of scientific and regulatory demands, e.g., when comparing methods or establishing their compliance with respect to biases and fairness constraints. We propose a theoretical framework that not only provides concrete definitions for these terms, but also outlines all the steps necessary to produce explanations and interpretations. The framework also allows existing contributions to be re-contextualized such that their scope can be measured, thus making them comparable to other methods. We show that this framework is compliant with desiderata on explanations, on interpretability and on evaluation metrics. We present a use case showing how the framework can be used to compare LIME, SHAP and MDNet, establishing their advantages and shortcomings. Finally, we discuss relevant trends in XAI as well as recommendations for future work, all from the standpoint of our framework.