Deep Neural Networks (DNNs) are central to deep learning, and understanding their internal working mechanisms is crucial if they are to be used for emerging applications in medical and industrial AI. To this end, the current line of research typically links semantic concepts to a DNN's individual units or layers. However, this fails to capture the hierarchical inference procedure throughout the network. To address this issue, we introduce the novel concept of Neural Architecture Disentanglement (NAD) in this paper. Specifically, we disentangle a pre-trained network into hierarchical paths corresponding to specific concepts, forming the concept feature paths, i.e., the flow of concepts from the bottom to the top layers of a DNN. Such paths further enable us to quantify the interpretability of DNNs according to the learned diversity of human concepts. We select four representative types of architectures, ranging from handcrafted to AutoML-based, and conduct extensive experiments on object-based and scene-based datasets. Our NAD sheds important light on the information flow of semantic concepts in DNNs, and provides a fundamental metric that will facilitate the design of interpretable network architectures. Code will be available at: https://github.com/hujiecpp/NAD.
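To make the idea of concept feature paths concrete, the sketch below shows one simplified way such a path and a diversity-based interpretability score could be computed. It is not the NAD algorithm from the paper: here a "path" is merely the set of channels per layer whose mean activation is highest for images of one concept, and "interpretability" is the average pairwise Jaccard distance between concept paths. The choice of ResNet-18, the hooked layers, the top-k scoring, and the diversity measure are all illustrative assumptions.

```python
# Simplified illustration of concept feature paths (NOT the paper's NAD method).
# A "path" = per-layer set of the most active channels for one concept;
# diversity = average pairwise Jaccard distance between concept paths.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()
layers = {"layer1": model.layer1, "layer2": model.layer2,
          "layer3": model.layer3, "layer4": model.layer4}

def concept_path(images, top_k=8):
    """Return, for each layer, the indices of the top_k most active channels."""
    acts = {}
    hooks = [m.register_forward_hook(
        lambda _m, _i, out, name=name: acts.__setitem__(name, out))
        for name, m in layers.items()]
    with torch.no_grad():
        model(images)
    for h in hooks:
        h.remove()
    # Score each channel by its mean activation over images and spatial positions.
    return {name: set(a.mean(dim=(0, 2, 3)).topk(top_k).indices.tolist())
            for name, a in acts.items()}

def path_diversity(paths):
    """Average pairwise Jaccard distance between concept paths (higher = more diverse)."""
    names, dists = list(paths), []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            for layer in layers:
                a, b = paths[names[i]][layer], paths[names[j]][layer]
                dists.append(1 - len(a & b) / len(a | b))
    return sum(dists) / len(dists)

# Stand-in batches for two concepts (real usage: images of, e.g., "dog" and "car").
paths = {c: concept_path(torch.randn(4, 3, 224, 224)) for c in ["concept_a", "concept_b"]}
print(path_diversity(paths))
```

Under this reading, a more interpretable network is one whose concept paths overlap less, i.e., different semantic concepts are routed through more distinct sets of units from the bottom to the top layers.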