We propose a novel method that explains the decisions of a deep neural network by investigating how the intermediate representations at each layer of the network were refined during training. This way we can (a) identify the most influential training examples and (b) analyze which classes contributed most to the final representation. Our method is general: it can be wrapped around any iterative optimization procedure and covers a variety of neural network architectures, including feed-forward networks and convolutional neural networks. We first derive the method for stochastic training with single training instances and then extend it to the common mini-batch setting. In experimental evaluations, we show that our method identifies highly representative training instances that can serve as an explanation. Additionally, we propose a visualization that provides explanations in the form of aggregated statistics over the whole training process.
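To make the core idea concrete, the following is a minimal, self-contained sketch in Python, not the paper's actual algorithm: during single-instance stochastic training, each example's SGD step is attributed influence according to how much it shifts the intermediate (hidden-layer) representation of a probe input. All names (`probe`, `influence`, the toy two-blob data, the single tanh hidden layer) are illustrative assumptions for this sketch.

```python
import numpy as np

# Hypothetical sketch: rank training examples by how much their single-instance
# SGD updates change the hidden representation of a fixed probe input.
rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs with labels 0/1 (illustrative, not from the paper).
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# One hidden layer; its activation is the "intermediate representation".
W1 = rng.normal(0, 0.1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.1, (8, 2)); b2 = np.zeros(2)

def hidden(x):
    return np.tanh(x @ W1 + b1)

def forward(x):
    h = hidden(x)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max())
    return h, e / e.sum()

probe = X[0]                   # the test input whose decision we explain
influence = np.zeros(len(X))   # accumulated representation shift per example
lr = 0.1

for epoch in range(5):
    for i in rng.permutation(len(X)):
        h_before = hidden(probe)
        # One SGD step on a single training instance (softmax cross-entropy).
        h, p = forward(X[i])
        grad_logits = p.copy(); grad_logits[y[i]] -= 1.0
        grad_W2 = np.outer(h, grad_logits)
        grad_pre = (W2 @ grad_logits) * (1 - h**2)   # tanh derivative
        grad_W1 = np.outer(X[i], grad_pre)
        W2 -= lr * grad_W2; b2 -= lr * grad_logits
        W1 -= lr * grad_W1; b1 -= lr * grad_pre
        # Attribute the induced change in the probe's representation to example i.
        influence[i] += np.linalg.norm(hidden(probe) - h_before)

top = np.argsort(influence)[::-1][:5]
print("most influential training examples:", top)
```

A mini-batch variant of this sketch would perform the update on a batch and split the measured representation shift among the batch members; aggregating `influence` per class would give the class-level attribution statistics mentioned above.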