A reasonable definition of semantic interpretability is a core challenge in explainable AI. This paper proposes a method to modify a traditional convolutional neural network (CNN) into an interpretable compositional CNN, so as to learn filters that encode meaningful visual patterns in intermediate convolutional layers. In a compositional CNN, each filter is expected to consistently represent a specific compositional object part or image region with a clear meaning. The compositional CNN learns from image labels for classification, without any part or region annotations for supervision. Our method can be broadly applied to different types of CNNs. Experiments demonstrate the effectiveness of our method.
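To make the idea concrete, below is a minimal sketch of training a standard CNN with an added filter-interpretability term alongside the classification loss. The spatial-concentration loss here (penalizing the entropy of each filter's spatial activation distribution, so each filter tends to fire on one coherent region per image) is purely illustrative and is not the paper's actual formulation; the choice of VGG-16, the hooked layer index, and the weighting coefficient are likewise assumptions for demonstration.

```python
# Illustrative sketch: adding a filter-consistency regularizer to a standard
# classification objective. NOT the paper's method; all names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

def spatial_concentration_loss(feat):
    """Penalize spatially scattered activations so that each filter tends to
    respond to a single coherent part/region per image.

    feat: activation tensor of shape (batch, channels, H, W).
    """
    act = F.relu(feat).flatten(2)                       # (b, c, H*W)
    prob = act / (act.sum(dim=2, keepdim=True) + 1e-8)  # spatial distribution
    # Low entropy = activation concentrated on a small region.
    entropy = -(prob * (prob + 1e-8).log()).sum(dim=2)
    return entropy.mean()

model = vgg16(weights=None)
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 1000, (4,))

# Hook an intermediate conv layer whose filters we want to be interpretable
# (index 28 is a high conv layer in VGG-16; chosen here for illustration).
feats = {}
model.features[28].register_forward_hook(lambda m, i, o: feats.update(x=o))

logits = model(images)
loss = F.cross_entropy(logits, labels) \
     + 0.1 * spatial_concentration_loss(feats["x"])    # 0.1 is an arbitrary weight
loss.backward()
```

Because the regularizer only hooks an intermediate layer and adds a term to the loss, the same pattern applies to other CNN backbones without architectural changes, which matches the abstract's claim that the method can be broadly applied to different types of CNNs.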