A central issue addressed by the rapidly growing research area of eXplainable Artificial Intelligence (XAI) is how to provide post-hoc explanations for the behaviour of non-interpretable Machine Learning (ML) models. It has become increasingly evident that new directions for producing better explanations should take into account what constitutes a good explanation for a human user. This paper proposes developing an XAI framework that can produce multiple explanations for the response of an image classification system in terms of potentially different middle-level input features. To this end, we propose an XAI framework able to construct explanations in terms of input features extracted by autoencoders. We start from the hypothesis that some autoencoders, relying on standard data representation approaches, can extract input properties that are more salient and understandable to a human user than raw low-level features; we call these properties \textit{Middle-Level input Features} (MLFs). Furthermore, by extracting different types of MLFs through different types of autoencoders, different types of explanations for the same ML system behaviour can be returned. We experimentally tested our method on two image datasets using three types of MLFs. The results are encouraging. Although our novel approach was tested in the context of image classification, it can potentially be applied to other data types insofar as autoencoders able to extract humanly understandable representations are available.
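To make the core idea concrete, the following is a minimal sketch (an illustrative assumption, not the paper's actual implementation): a tiny linear autoencoder trained by gradient descent on toy data, whose bottleneck activations play the role of middle-level features (MLFs) that an explanation method could then score for relevance.

```python
import numpy as np

# Hypothetical toy setup: 100 "images" of 16 raw pixels each.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))

# Encoder/decoder weights for a 4-unit bottleneck.
W_enc = rng.normal(scale=0.1, size=(16, 4))
W_dec = rng.normal(scale=0.1, size=(4, 16))

lr = 0.05
for _ in range(1000):
    Z = X @ W_enc          # bottleneck activations: the MLFs
    X_hat = Z @ W_dec      # reconstruction of the raw input
    err = X_hat - X        # reconstruction error
    # Gradient descent on the mean squared reconstruction error.
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

Z = X @ W_enc                          # final MLFs for each input
mse = np.mean((Z @ W_dec - X) ** 2)    # should be below the ~1.0 of no reconstruction
```

In the framework described above, each column of `Z` would correspond to one candidate middle-level feature; swapping this linear model for, e.g., a sparse or variational autoencoder yields a different MLF type and hence a different style of explanation.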