Explainable AI has emerged as a key component of black-box machine learning approaches in domains with a high demand for reliability or transparency. Examples are medical assistant systems, and applications governed by the General Data Protection Regulation of the European Union, which features transparency as a cornerstone. Such demands require the ability to audit the rationale behind a classifier's decision. While visualizations are the de facto standard of explanations, they fall short in terms of expressiveness in many ways: they cannot distinguish between different attribute manifestations of visual features (e.g., eye open vs. closed), and they cannot accurately describe the influence of the absence of features or of relations between features. An alternative would be more expressive symbolic surrogate models. However, these require symbolic inputs, which are not readily available in most computer vision tasks. In this paper we investigate how to overcome this: we use inherent features learned by the network to build a global, expressive, verbal explanation of the rationale of a feed-forward convolutional deep neural network (DNN). The semantics of the features are mined by a concept analysis approach trained on a set of human-understandable visual concepts. The explanation is found by an Inductive Logic Programming (ILP) method and presented as first-order rules. We show that our explanation is faithful to the original black-box model. The code for our experiments is available at https://github.com/mc-lovin-mlem/concept-embeddings-and-ilp/tree/ki2020.
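As a rough illustration of the pipeline the abstract describes (a sketch under assumptions, not the authors' implementation), the following Python fragment shows the two ingredients: a linear concept probe fitted on intermediate DNN activations (concept analysis), and the export of thresholded concept detections as symbolic facts that an ILP system could consume as background knowledge. The toy network, the probe training loop, and the concept name `eye_open` are all assumptions made for illustration only.

```python
# Minimal sketch (assumptions: a CNN feature extractor and a small labelled set
# of images annotated with whether a visual concept is present).
import torch
import torch.nn as nn

# Stand-in feature extractor and toy data; replace with the DNN under analysis.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten())
images = torch.randn(64, 3, 32, 32)                  # toy inputs
concept_labels = torch.randint(0, 2, (64,)).float()  # 1 = concept present

# (1) Concept analysis: fit a linear "concept vector" on frozen activations.
with torch.no_grad():
    feats = model(images)                            # (N, 8) activations
probe = nn.Linear(feats.shape[1], 1)
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(probe(feats).squeeze(1), concept_labels)
    loss.backward()
    opt.step()

# (2) Symbolic inputs: threshold the probe output per image and emit facts
# (Prolog-style) that an ILP learner can use as background knowledge.
with torch.no_grad():
    present = torch.sigmoid(probe(feats).squeeze(1)) > 0.5
for i, p in enumerate(present):
    if p:
        print(f"has_concept(img_{i}, eye_open).")    # illustrative concept name
```

From such facts, together with the DNN's predicted labels as positive and negative examples, an ILP system could induce first-order rules of the kind the paper reports as global verbal explanations, e.g. rules relating a predicted class to the presence, absence, and relations of visual concepts.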