Current visualization-based network interpretation methods suffer from a lack of semantic-level information. In this paper, we introduce the novel task of interpreting classification models using fine-grained textual summarization. Along with the label prediction, the network generates a sentence explaining its decision. Constructing a fully annotated dataset of filter-text pairs is unrealistic because of the complexity of the image-to-filter response function. We instead propose a weakly supervised learning algorithm that leverages off-the-shelf image caption annotations. Central to our algorithm is the filter-level attribute probability density function (PDF), learned as a conditional probability through Bayesian inference with the input image and its feature map as latent variables. We show that our algorithm faithfully reflects the features learned by the model, using rigorous applications such as attribute-based image retrieval and unsupervised text grounding. We further show that the textual summarization process can help in understanding network failure patterns and can provide clues for further improvements.
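To make the idea of a filter-level attribute distribution concrete, the following is a minimal sketch of one plausible reading of it, not the formulation used in the paper. It assumes the image is the only latent variable and estimates P(attribute | filter) as the sum over images of P(attribute | image) · P(image | filter), where P(image | filter) is taken to be the filter's normalized response and P(attribute | image) is uniform over the attribute words in that image's caption. The function name, the toy activations, and the toy captions are all hypothetical.

```python
def filter_attribute_distribution(activations, caption_attrs, vocabulary):
    """Estimate P(attribute | filter) by marginalizing over images.

    activations:   one list per image, holding a non-negative response per filter
    caption_attrs: one set per image, holding attribute words from its caption
    vocabulary:    list of attribute words to score
    Returns a list (one entry per filter) of {attribute: probability} dicts.
    """
    num_filters = len(activations[0])
    distributions = []
    for f in range(num_filters):
        # Assumption: P(image | filter) is the filter's response normalized over images.
        total = sum(image_act[f] for image_act in activations)
        weights = {a: 0.0 for a in vocabulary}
        for image_act, attrs in zip(activations, caption_attrs):
            present = [a for a in attrs if a in weights]
            if total == 0 or not present:
                continue
            p_image = image_act[f] / total
            # Assumption: P(attribute | image) is uniform over the caption's attributes.
            for a in present:
                weights[a] += p_image / len(present)
        norm = sum(weights.values())
        distributions.append(
            {a: (w / norm if norm > 0 else 0.0) for a, w in weights.items()}
        )
    return distributions


# Toy data (hypothetical): three images, two filters.
activations = [[0.9, 0.1], [0.8, 0.0], [0.1, 1.0]]
caption_attrs = [{"red", "beak"}, {"red"}, {"long", "beak"}]
vocabulary = ["red", "beak", "long"]

dist = filter_attribute_distribution(activations, caption_attrs, vocabulary)
top_words = [max(d, key=d.get) for d in dist]
print(top_words)
```

In this toy setup the first filter fires mostly on images captioned "red", so that word dominates its distribution, and no filter-text pairs were ever annotated by hand. The paper's algorithm additionally treats the feature map as a latent variable, which this sketch does not attempt to model.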