While the emerging research field of explainable artificial intelligence (XAI) claims to address the lack of explainability in high-performance machine learning models, in practice, XAI targets developers rather than actual end-users. Unsurprisingly, end-users are often unwilling to use XAI-based decision support systems. Likewise, there is little interdisciplinary research on how end-users behave when using XAI explanations, so it remains unclear how explanations affect cognitive load and, in turn, end-user performance. We therefore conducted an empirical study with 271 prospective physicians, measuring their cognitive load, task performance, and task time for distinct implementation-independent XAI explanation types in a COVID-19 use case. We found that these explanation types strongly influence end-users' cognitive load, task performance, and task time. Further, we contextualized a mental efficiency metric, which ranked local XAI explanation types best, to derive recommendations for future applications and implications for sociotechnical XAI research.
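The mental efficiency metric is only named, not defined, in the abstract. As a hedged sketch, assuming the standard instructional efficiency measure from cognitive load theory (Paas and van Merriënboer), it combines standardized task performance and standardized mental effort (cognitive load) scores:

\[ E = \frac{z_{\mathrm{performance}} - z_{\mathrm{effort}}}{\sqrt{2}} \]

Under this reading, a higher \(E\) means a given performance level was reached at lower cognitive cost, which is one plausible way the local XAI explanation types could come out ranked best.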