Understanding the inner workings of deep neural networks (DNNs) is essential for providing trustworthy artificial intelligence techniques in practical applications. Existing studies typically link semantic concepts to individual units or layers of DNNs but fail to explain the inference process. In this paper, we introduce neural architecture disentanglement (NAD) to fill this gap. Specifically, NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes. We investigate whether, where, and how the disentanglement occurs through experiments conducted with handcrafted and automatically-searched network architectures on both object-based and scene-based datasets. Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs. First, DNNs can be divided into sub-architectures for independent tasks. Second, deeper layers do not always correspond to higher semantics. Third, the connection type in a DNN affects how information flows across layers, leading to different disentanglement behaviors. With NAD, we further explain why DNNs sometimes give wrong predictions. Experimental results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones. Code will be available at: https://github.com/hujiecpp/NAD.
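To make the core idea concrete, below is a minimal, hypothetical sketch of what disentangling a pre-trained network into per-task sub-architectures could look like: a frozen CNN is augmented with learnable per-class channel gates, and a sparsity penalty encourages each class to route through a compact sub-architecture. The gate parameterization, loss terms, and toy backbone are illustrative assumptions for exposition, not the authors' actual NAD formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConvNet(nn.Module):
    """A frozen two-block CNN with learnable per-class channel gates (illustrative only)."""
    def __init__(self, num_classes=10, channels=(16, 32)):
        super().__init__()
        self.conv1 = nn.Conv2d(3, channels[0], 3, padding=1)
        self.conv2 = nn.Conv2d(channels[0], channels[1], 3, padding=1)
        self.fc = nn.Linear(channels[1], num_classes)
        # Freeze the "pre-trained" backbone; only the gates are trained afterwards.
        for p in self.parameters():
            p.requires_grad_(False)
        # One gate logit per (class, channel); sigmoid(gate) lies in [0, 1].
        self.gate1 = nn.Parameter(torch.zeros(num_classes, channels[0]))
        self.gate2 = nn.Parameter(torch.zeros(num_classes, channels[1]))

    def forward(self, x, y):
        # Select each sample's class-specific gates and mask the feature channels,
        # so every class effectively uses its own sub-architecture.
        g1 = torch.sigmoid(self.gate1[y]).unsqueeze(-1).unsqueeze(-1)
        g2 = torch.sigmoid(self.gate2[y]).unsqueeze(-1).unsqueeze(-1)
        h = F.relu(self.conv1(x)) * g1
        h = F.max_pool2d(h, 2)
        h = F.relu(self.conv2(h)) * g2
        h = h.mean(dim=(2, 3))  # global average pooling
        return self.fc(h)

def disentanglement_loss(model, logits, y, sparsity=1e-3):
    """Keep the class prediction correct while encouraging sparse gates,
    so each class is assigned a compact sub-architecture (assumed objective)."""
    task_loss = F.cross_entropy(logits, y)
    gate_l1 = torch.sigmoid(model.gate1[y]).mean() + torch.sigmoid(model.gate2[y]).mean()
    return task_loss + sparsity * gate_l1

# Toy usage: optimize only the gates on random data.
model = GatedConvNet()
opt = torch.optim.Adam([model.gate1, model.gate2], lr=1e-2)
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
loss = disentanglement_loss(model, model(x, y), y)
loss.backward()
opt.step()
```

Under this reading, comparing the learned gate patterns across classes would indicate where sub-architectures overlap, and a misclassified image would tend to activate a sub-architecture similar to that of the predicted (incorrect) class.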