As machine learning and algorithmic decision-making systems are increasingly leveraged in high-stakes human-in-the-loop settings, there is a pressing need to understand the rationale behind their predictions. Researchers have responded to this need with explainable AI (XAI), but often proclaim interpretability axiomatically without evaluation. When these systems are evaluated, they are often tested through offline simulations with proxy metrics of interpretability (such as model complexity). We empirically evaluate the veracity of three common interpretability assumptions through a large-scale human-subjects experiment with a simple "placebo explanation" control. We find that feature attribution explanations provide marginal utility in our task for a human decision maker, and in certain cases result in worse decisions due to cognitive and contextual confounders. This result challenges the assumed universal benefit of applying these methods, and we hope this work will underscore the importance of human evaluation in XAI research. Supplemental materials -- including anonymized data from the experiment, code to replicate the study, an interactive demo of the experiment, and the models used in the analysis -- can be found at: https://doi.pizza/challenging-xai.