With the recent success of deep neural networks in computer vision, it is important to understand the internal workings of these networks. What does a given neuron represent? The concepts captured by a neuron may be hard to understand or express in simple terms. The approach we propose in this paper is to characterize the region of input space that excites a given neuron to a certain level; we call this the inverse set. This inverse set is a complicated high-dimensional object that we explore with an optimization-based sampling approach. Human inspection of samples from this set can reveal regularities that help to understand the neuron. This goes beyond earlier approaches, which were limited either to finding an image that maximally activates the neuron or to sampling images with Markov chain Monte Carlo; the latter is very slow, generates samples with little diversity, and lacks control over the activation value of the generated samples. Our approach also allows us to explore the intersection of the inverse sets of several neurons, as well as other variations.
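To make the notion of an inverse set concrete, the following is a minimal sketch on a toy problem, not the algorithm of the paper. It assumes a single hypothetical neuron tanh(w·x + b) with unit-norm weights, and it uses plain gradient descent on the squared activation error from random starting points as a simplified stand-in for the optimization-based sampler. The names `activation` and `sample_inverse_set`, the target level 0.5, and all step sizes are illustrative assumptions.

```python
import numpy as np

def activation(x, w, b):
    """Toy neuron (an assumption for illustration): tanh of an affine map."""
    return np.tanh(w @ x + b)

def sample_inverse_set(w, b, target, n_samples=20, steps=500, lr=0.5, seed=0):
    """Return points whose activation is approximately `target`.

    Each sample starts from a different random point and minimizes
    (activation(x) - target)^2 by gradient descent. This is a simplified
    stand-in, not the paper's sampling procedure.
    """
    rng = np.random.default_rng(seed)
    samples = np.empty((n_samples, w.size))
    for i in range(n_samples):
        x = 0.5 * rng.normal(size=w.size)  # random start gives diversity
        for _ in range(steps):
            a = activation(x, w, b)
            # Gradient of (a - target)^2 with respect to x, via the chain rule.
            grad = 2.0 * (a - target) * (1.0 - a**2) * w
            x = x - lr * grad
        samples[i] = x
    return samples

# Hypothetical 8-dimensional neuron with unit-norm weights.
rng = np.random.default_rng(1)
w = rng.normal(size=8)
w /= np.linalg.norm(w)

samples = sample_inverse_set(w, b=0.1, target=0.5)
acts = np.tanh(samples @ w + 0.1)  # activations of the returned samples
```

In this toy case the inverse set is a hyperplane, so the samples differ only in directions orthogonal to `w`; for a real network the set is far more complicated, which is what motivates a dedicated sampler with explicit control over the activation value.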