Explaining neural network models is a challenging task that remains unsolved in its entirety to this day. This is especially true for high-dimensional and complex data. With the present work, we introduce two notions of conceptual views of a neural network, specifically a many-valued and a symbolic view. Both provide novel analysis methods that enable a human AI analyst to gain deeper insight into the knowledge captured by the neurons of a network. We test the conceptual expressivity of our novel views through different experiments on the ImageNet and Fruit-360 data sets. Furthermore, we show to what extent the views allow quantifying the conceptual similarity of different learning architectures. Finally, we demonstrate how conceptual views can be applied for the abductive learning of human-comprehensible rules from neurons. In summary, with our work, we contribute to the highly relevant task of globally explaining neural network models.