Numerous government initiatives (e.g., the EU with the GDPR) are coming to the conclusion that the increasing complexity of modern software systems must be counterbalanced by some Right to Explanation and by metrics for the Impact Assessment of these tools, allowing humans to understand and oversee the output of Automated Decision-Making systems. Explainable AI was born as a pathway to allow humans to explore and understand the inner workings of complex systems. However, establishing what an explanation is and objectively evaluating explainability are not trivial tasks. With this paper, we present a new model-agnostic metric to measure the Degree of eXplainability of (correct) information in an objective way. It exploits a specific theoretical model from Ordinary Language Philosophy, Achinstein's Theory of Explanations, implemented through an algorithm relying on deep language models for knowledge graph extraction and information retrieval. To verify whether this metric actually behaves as explainability is expected to, we devised a few experiments and user studies involving more than 160 participants, who evaluated two realistic AI-based systems for healthcare and finance built with well-known AI technologies, including Artificial Neural Networks and TreeSHAP. The results we obtained are very encouraging, suggesting that our proposed metric for measuring the Degree of eXplainability is robust across several scenarios and can eventually be exploited for a lawful Impact Assessment of an Automated Decision-Making system.