Artificial intelligence (AI) has huge potential to improve the health and well-being of people, but adoption in clinical practice is still limited. Lack of transparency is identified as one of the main barriers to implementation, as clinicians should be confident that the AI system can be trusted. Explainable AI has the potential to overcome this issue and can be a step towards trustworthy AI. In this paper we review the recent literature to provide guidance to researchers and practitioners on the design of explainable AI systems for the health care domain and to contribute to the formalization of the field of explainable AI. We argue that the reason to demand explainability determines what should be explained, as this determines the relative importance of the properties of explainability (i.e. interpretability and fidelity). Based on this, we propose a framework to guide the choice between classes of explainable AI methods (explainable modelling versus post-hoc explanation; model-based, attribution-based, or example-based explanations; global and local explanations). Furthermore, we find that quantitative evaluation metrics, which are important for objective standardized evaluation, are still lacking for some properties (e.g. clarity) and types of explanations (e.g. example-based methods). We conclude that explainable modelling can contribute to trustworthy AI, but the benefits of explainability still need to be proven in practice, and complementary measures might be needed to create trustworthy AI in health care (e.g. reporting data quality, performing extensive (external) validation, and regulation).