Deep neural network-based image classification can be misled by adversarial examples with small, quasi-imperceptible perturbations. Furthermore, adversarial examples created on one classification model can also fool other, different models. The transferability of adversarial examples has recently attracted growing interest since it makes black-box attacks on classification models feasible. As an extension of classification, semantic segmentation has also received much attention regarding its adversarial robustness. However, the transferability of adversarial examples across segmentation models has not been systematically studied. In this work, we study this topic intensively. First, we explore the overfitting phenomenon of adversarial examples on classification and segmentation models. In contrast to the observation on classification models that transferability is limited by overfitting to the source model, we find that adversarial examples on segmentation models do not always overfit the source models. Even when no overfitting is present, the transferability of the adversarial examples is limited. We attribute this limitation to an architectural trait of segmentation models, namely multi-scale object recognition. We then propose a simple and effective method, dubbed dynamic scaling, to overcome this limitation. The high transferability achieved by our method shows that, in contrast to observations in previous work, adversarial examples created on a segmentation model can transfer easily to other segmentation models. Our analysis and proposals are supported by extensive experiments.
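The abstract does not spell out the dynamic scaling procedure. As an illustration of the general idea it points to, namely countering multi-scale object recognition by varying the input scale while the perturbation is crafted, the following is a minimal sketch of a scale-augmented iterative attack on a segmentation model. The function name `scale_augmented_attack`, the PyTorch setup, and the chosen scales, step size, and perturbation budget are illustrative assumptions, not the paper's exact method.

```python
# Hypothetical sketch only: a scale-augmented iterative attack on a segmentation
# model, illustrating the idea of resampling the input scale at every step so the
# perturbation is not tuned to a single feature-map resolution.
import random
import torch
import torch.nn.functional as F

def scale_augmented_attack(model, image, target, eps=8 / 255, alpha=2 / 255,
                           steps=40, scales=(0.75, 1.0, 1.25)):
    """image: (N, C, H, W) in [0, 1]; target: (N, H, W) long segmentation labels."""
    delta = torch.zeros_like(image, requires_grad=True)   # L-inf perturbation
    for _ in range(steps):
        s = random.choice(scales)                          # resample the scale each step
        adv = torch.clamp(image + delta, 0.0, 1.0)
        adv_s = F.interpolate(adv, scale_factor=s, mode='bilinear',
                              align_corners=False)
        logits = model(adv_s)                              # (N, classes, h', w')
        logits = F.interpolate(logits, size=target.shape[-2:], mode='bilinear',
                               align_corners=False)        # match the label resolution
        loss = F.cross_entropy(logits, target, ignore_index=255)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()             # untargeted: maximize the loss
            delta.clamp_(-eps, eps)                        # stay within the budget
        delta.grad.zero_()
    return torch.clamp(image + delta, 0.0, 1.0).detach()
```

The sketch assumes a fully convolutional source model that accepts inputs of varying spatial size; the attack itself is a standard iterative sign-gradient update, with the input rescaling added on top.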