There are several effective methods for explaining the inner workings of convolutional neural networks (CNNs). In general, however, inverting the function computed by a CNN as a whole is an ill-posed problem. In this paper, we propose a method based on adjoint operators that, given an arbitrary unit in the CNN (other than those in the first convolutional layer), reconstructs its effective hypersurface in the input space, that is, a hypersurface that replicates the unit's decision surface conditioned on a particular input image. Our results show that the hypersurface reconstructed this way, when multiplied by the original input image, gives nearly the exact output value of that unit. We find that the CNN unit's decision surface is largely conditioned on the input, and this may explain why adversarial inputs can effectively deceive CNNs.
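The idea of an input-conditioned effective hypersurface can be illustrated on a toy network. The sketch below is not the paper's implementation: it assumes a bias-free, two-layer ReLU network with dense weights standing in for convolutions (a convolution is also a linear map), and all names (`W1`, `w2`, `effective_weights`, and so on) are hypothetical. Under these assumptions, applying the adjoint (transpose) of the first layer to the ReLU-gated weights of the unit yields an input-space weight vector whose inner product with the input reproduces the unit's output exactly. With biases or other nonlinearities, the match would only be approximate.

```python
# Toy illustration (assumed setup, not the paper's code): a bias-free
# two-layer ReLU network, y = w2 . relu(W1 x).

def matvec(W, x):
    """Apply the linear map W to the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def adjoint_matvec(W, v):
    """Apply the adjoint (transpose) of W to the vector v."""
    return [sum(W[i][j] * v[i] for i in range(len(W)))
            for j in range(len(W[0]))]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def forward(W1, w2, x):
    """Return the unit's output and the ReLU on/off pattern for input x."""
    pre = matvec(W1, x)
    mask = [1.0 if p > 0 else 0.0 for p in pre]
    hidden = [p * m for p, m in zip(pre, mask)]
    return dot(w2, hidden), mask

def effective_weights(W1, w2, mask):
    """Pull the unit's weights back to input space, conditioned on the mask."""
    gated = [a * m for a, m in zip(w2, mask)]
    return adjoint_matvec(W1, gated)

# Hypothetical weights and inputs chosen only for illustration.
W1 = [[1.0, -2.0, 0.5],
      [-1.0, 1.0, 1.0],
      [0.5, 0.5, -1.0]]
w2 = [2.0, -1.0, 3.0]
x_a = [1.0, 0.0, 2.0]
x_b = [-1.0, 2.0, 0.0]

y_a, mask_a = forward(W1, w2, x_a)
y_b, mask_b = forward(W1, w2, x_b)
w_eff_a = effective_weights(W1, w2, mask_a)
w_eff_b = effective_weights(W1, w2, mask_b)

print("output for x_a:", y_a, "reconstructed:", dot(w_eff_a, x_a))
print("output for x_b:", y_b, "reconstructed:", dot(w_eff_b, x_b))
print("effective weights differ across inputs:", w_eff_a != w_eff_b)
```

In this toy setting the two inputs activate different hidden units, so the reconstructed weight vectors differ, which mirrors the abstract's observation that a unit's decision surface depends on the input it is evaluated on.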