Explainable artificial intelligence and interpretable machine learning are research domains of growing importance. Yet, the underlying concepts remain somewhat elusive and lack generally agreed definitions. While recent inspiration from the social sciences has refocused the work on the needs and expectations of human recipients, the field still lacks a concrete conceptualisation. We take steps towards addressing this challenge by reviewing the philosophical and social foundations of human explainability, which we then translate into the technological realm. In particular, we scrutinise the notion of algorithmic black boxes and the spectrum of understanding determined by explanatory processes and explainees' background knowledge. This approach allows us to define explainability as (logical) reasoning applied to transparent insights (into, possibly black-box, predictive systems) interpreted under background knowledge and placed within a specific context -- a process that engenders understanding in a selected group of explainees. We then employ this conceptualisation to revisit strategies for evaluating explainability as well as the much-disputed trade-off between transparency and predictive power, including its implications for ante-hoc and post-hoc techniques along with the fairness and accountability established by explainability. We furthermore discuss components of the machine learning workflow that may be in need of interpretability, building on a range of ideas from human-centred explainability, with a particular focus on explainees, contrastive statements and explanatory processes. Our discussion reconciles and complements current research to help better navigate open questions -- rather than attempting to address any individual issue -- thus laying a solid foundation for a grounded discussion of, and future progress in, explainable artificial intelligence and interpretable machine learning.