Interpretability has become essential for artificial intelligence in high-risk domains such as healthcare, banking, and security. For commonly used tabular data, traditional methods train end-to-end machine learning models on numerical and categorical data alone and do not leverage human-understandable knowledge such as data descriptions. Mining human-level knowledge from tabular data and using it for prediction therefore remains a challenge. To address this, we propose a concept and argumentation based model (CAM) with two components: a novel concept mining method that obtains human-understandable concepts and their relations from both feature descriptions and the underlying data, and a quantitative argumentation-based method for knowledge representation and reasoning. As a result, CAM makes decisions grounded in human-level knowledge, and its reasoning process is intrinsically interpretable. Finally, to visualize the proposed interpretable model, we provide a dialogical explanation that contains the dominant reasoning path within CAM. Experimental results on both an open-source benchmark dataset and a real-world business dataset show that (1) CAM is transparent and interpretable, and the knowledge inside CAM is coherent with human understanding; and (2) our interpretable approach achieves results competitive with other state-of-the-art models.