Deep Learning (DL) is being applied in various domains, especially in safety-critical applications such as autonomous driving. Consequently, it is of great significance to ensure the robustness of these methods and thus counteract uncertain behaviors caused by adversarial attacks. In this paper, we use gradient heatmaps to analyze the response characteristics of the VGG-16 model when the input images are mixed with adversarial noise and statistically similar Gaussian random noise. In particular, we compare the network response layer by layer to determine where errors occur. We derive several interesting findings. First, compared to Gaussian random noise, intentionally generated adversarial noise causes severe behavioral deviation by distracting the network's area of concentration. Second, in many cases, adversarial examples only need to compromise a few intermediate blocks to mislead the final decision. Third, our experiments reveal that specific blocks are more vulnerable and more easily exploited by adversarial examples. Finally, we demonstrate that the layers $Block4\_conv1$ and $Block5\_conv1$ of the VGG-16 model are more susceptible to adversarial attacks. Our work could provide valuable insights into developing more reliable Deep Neural Network (DNN) models.
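The following is a minimal illustrative sketch (not the authors' code) of the kind of layer-wise comparison the abstract describes: it perturbs an input with one-step FGSM adversarial noise and with magnitude-matched Gaussian noise, then measures how strongly each probed VGG-16 block deviates from its clean response. The example image, class index, and perturbation budget are placeholders, and activation deviation is used here as a simplified proxy for the paper's gradient-heatmap analysis.

```python
# Illustrative sketch: compare VGG-16 block responses under FGSM adversarial
# noise vs. statistically similar (magnitude-matched) Gaussian noise.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import vgg16

model = vgg16.VGG16(weights="imagenet")
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

def fgsm_noise(x, label, eps):
    """One-step FGSM perturbation: eps times the sign of the input gradient."""
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(label, model(x))
    return eps * tf.sign(tape.gradient(loss, x))

def layer_responses(x, layer_names):
    """Activations of the selected intermediate blocks for input x."""
    probe = tf.keras.Model(model.input,
                           [model.get_layer(n).output for n in layer_names])
    return probe(x)

# Hypothetical example input, already preprocessed to VGG-16 format.
x = np.random.rand(1, 224, 224, 3).astype("float32") * 255.0
x = vgg16.preprocess_input(x)
label = np.array([207])   # placeholder ImageNet class index
eps = 2.0                 # placeholder perturbation budget (preprocessed units)

adv = fgsm_noise(x, label, eps)
# Gaussian noise matched to the adversarial perturbation's standard deviation.
gauss = tf.random.normal(tf.shape(x), stddev=tf.math.reduce_std(adv))

layers = ["block4_conv1", "block5_conv1"]   # blocks highlighted in the paper
clean = layer_responses(x, layers)
resp_adv = layer_responses(x + adv, layers)
resp_gauss = layer_responses(x + gauss, layers)

# Relative deviation per block: larger values indicate stronger disruption.
for name, c, a, g in zip(layers, clean, resp_adv, resp_gauss):
    dev_adv = tf.norm(a - c) / tf.norm(c)
    dev_gauss = tf.norm(g - c) / tf.norm(c)
    print(f"{name}: adversarial {dev_adv:.3f} vs Gaussian {dev_gauss:.3f}")
```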