While several types of post hoc explanation methods (e.g., feature attribution methods) have been proposed in recent literature, there is little to no work on systematically benchmarking these methods in an efficient and transparent manner. Here, we introduce OpenXAI, a comprehensive and extensible open-source framework for evaluating and benchmarking post hoc explanation methods. OpenXAI comprises the following key components: (i) a flexible synthetic data generator and a collection of diverse real-world datasets, pre-trained models, and state-of-the-art feature attribution methods, (ii) open-source implementations of twenty-two quantitative metrics for evaluating the faithfulness, stability (robustness), and fairness of explanation methods, and (iii) the first-ever public XAI leaderboards for benchmarking explanations. OpenXAI is easily extensible, as users can readily evaluate custom explanation methods and incorporate them into our leaderboards. Overall, OpenXAI provides an automated end-to-end pipeline that not only simplifies and standardizes the evaluation of post hoc explanation methods, but also promotes transparency and reproducibility in benchmarking these methods. OpenXAI datasets and data loaders, implementations of state-of-the-art explanation methods and evaluation metrics, as well as the leaderboards, are publicly available at https://open-xai.github.io/.
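To make the end-to-end pipeline described above concrete, the following is a minimal usage sketch covering the four stages: loading a benchmark dataset, loading a pre-trained model, generating post hoc feature attributions, and scoring them with an evaluation metric. The module, class, and argument names (e.g., ReturnLoaders, LoadModel, Explainer, Evaluator, metric='PGI') are illustrative assumptions based on the framework's structure, not a verbatim copy of the library's API; consult https://open-xai.github.io/ for the actual interface.

```python
# Hypothetical sketch of the OpenXAI pipeline; names below are assumptions,
# not the authoritative API. See https://open-xai.github.io/ for the real interface.
from openxai.dataloader import ReturnLoaders   # assumed dataset/data-loader entry point
from openxai.model import LoadModel            # assumed pre-trained model loader
from openxai.explainer import Explainer        # assumed wrapper over attribution methods
from openxai.evaluator import Evaluator        # assumed wrapper over the 22 metrics

# 1. Load one of the bundled real-world datasets (here: German Credit).
trainloader, testloader = ReturnLoaders(data_name='german', download=True)
inputs, labels = next(iter(testloader))

# 2. Load a pre-trained model shipped with the benchmark.
model = LoadModel(data_name='german', ml_model='ann', pretrained=True)

# 3. Generate post hoc feature attributions with a chosen method (e.g., LIME).
explainer = Explainer(method='lime', model=model, dataset_tensor=inputs)
explanations = explainer.get_explanations(inputs)

# 4. Score the explanations with a quantitative metric
#    (e.g., a faithfulness metric such as Prediction Gap on Important features, PGI).
evaluator = Evaluator(model=model, metric='PGI')
score = evaluator.evaluate(inputs=inputs, explanations=explanations)
print(score)
```

Custom explanation methods would plug into the same evaluation step, which is how user-contributed methods can be scored and submitted to the public leaderboards.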