Deep learning techniques have been increasingly adopted for classification tasks over the past decade, yet explaining how deep learning architectures achieve state-of-the-art performance remains an elusive goal. Although all of the training information is embedded in a trained model, we still understand little about the model's performance from analyzing the model alone. This paper examines the neuron activation patterns of deep learning-based classification models and explores whether a model's performance can be explained through its neurons' activation behavior. We propose two approaches: one models neurons' activation behavior as a graph and examines whether the neurons form meaningful communities; the other examines the predictability of neurons' behavior using entropy. Our comprehensive experimental study reveals that both community quality (modularity) and entropy are closely related to model performance, thus paving a novel way of explaining deep learning models directly from their neurons' activation patterns.
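To make the two proposed measures concrete, the following is a minimal sketch, not the authors' implementation: it assumes binarized neuron activations, builds a hypothetical co-activation graph whose edge weights are co-firing frequencies, scores community quality with NetworkX's greedy modularity routine, and measures per-neuron predictability as Bernoulli entropy. The threshold, the graph construction, and the entropy definition are all illustrative assumptions.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity
from scipy.stats import entropy

# Hypothetical activation matrix: rows are input samples, columns are
# neurons; entries stand in for post-activation values from a trained model.
rng = np.random.default_rng(0)
activations = rng.random((500, 64))

# Assumed binarization: a neuron is "active" on a sample above a threshold.
active = activations > 0.5

# --- Approach 1: co-activation graph and community quality (modularity) ---
# Edge weight between two neurons = fraction of samples where both fire.
coact = (active.T @ active) / active.shape[0]
G = nx.Graph()
n = coact.shape[0]
for i in range(n):
    for j in range(i + 1, n):
        if coact[i, j] > 0:
            G.add_edge(i, j, weight=coact[i, j])

communities = greedy_modularity_communities(G, weight="weight")
Q = modularity(G, communities, weight="weight")
print(f"modularity of neuron communities: {Q:.3f}")

# --- Approach 2: predictability of neuron behavior via entropy ---
# Treat each neuron's on/off pattern as a Bernoulli variable; low entropy
# means the neuron's activation behavior is highly predictable.
p_on = active.mean(axis=0)
neuron_entropy = entropy(np.stack([p_on, 1 - p_on]), base=2)
print(f"mean activation entropy (bits): {neuron_entropy.mean():.3f}")
```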