Deep learning models have achieved state-of-the-art performance in many classification tasks, but most of them cannot provide an interpretation for their classification results. Interpretable machine learning models are usually linear or piecewise linear and yield inferior performance, whereas non-linear models classify much better but are hard to interpret. This may have changed with a proposed interpretable feedforward neural network (IFFNN) that achieves both high classification performance and interpretability for malware detection. If the IFFNN can perform well in a more flexible and general form on other classification tasks while providing meaningful interpretations, it may be of great interest to the applied machine learning community. In this paper, we propose a way to generalize the IFFNN to multi-class classification scenarios and to any type of feedforward neural network, and we evaluate its classification performance and interpretability on intrinsically interpretable datasets. We find that the generalized IFFNNs achieve classification performance comparable to that of their normal feedforward neural network counterparts while providing meaningful interpretations. Thus, this kind of neural network architecture is of great practical use.