In this paper, we investigate pretraining with adversarial networks in order to determine the relationship between network depth and robustness. To this end, we selectively retrain different portions of VGG and ResNet architectures on CIFAR-10, Imagenette, and ImageNet, using both non-adversarial and adversarial data. Experimental results show that susceptibility to adversarial samples is associated with the low-level feature extraction layers; retraining only the high-level layers is therefore insufficient for achieving robustness. Furthermore, under adversarial attack the early layers produce outputs that differ statistically from the features of non-adversarial samples and that subsequent layers cannot classify consistently. These findings support three common hypotheses: that robustness is tied to the feature extractor, that deeper layers alone are insufficient to provide robustness, and that adversarial and non-adversarial feature vectors differ substantially.
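
To make the idea of selectively retraining one portion of a network on adversarial data concrete, the following is a minimal sketch and not the authors' experimental setup. It assumes a toy two-stage model (an "early" tanh layer followed by a logistic "head") in place of VGG or ResNet, and it uses FGSM as a stand-in attack, since the abstract does not name one. All names (`forward`, `gradients`, `fgsm`, `retrain`, the `trainable` argument) are hypothetical and chosen only for illustration.

```python
import math
import random

def forward(params, x):
    """Toy two-stage model: early layer h = tanh(W x), head p = sigmoid(v . h)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in params["early"]]
    z = sum(vj * hj for vj, hj in zip(params["head"], h))
    return h, 1.0 / (1.0 + math.exp(-z))

def gradients(params, x, y):
    """Binary cross-entropy gradients w.r.t. early weights, head weights, and the input."""
    h, p = forward(params, x)
    dz = p - y  # dL/dz for sigmoid followed by cross-entropy
    g_head = [dz * hj for hj in h]
    # Back-propagate through tanh: dL/da_j = dz * v_j * (1 - h_j^2)
    da = [dz * vj * (1.0 - hj * hj) for vj, hj in zip(params["head"], h)]
    g_early = [[daj * xi for xi in x] for daj in da]
    g_x = [sum(da[j] * params["early"][j][i] for j in range(len(da)))
           for i in range(len(x))]
    return g_early, g_head, g_x

def fgsm(params, x, y, eps):
    """FGSM (assumed attack): shift each input coordinate by eps along the loss-gradient sign."""
    _, _, g_x = gradients(params, x, y)
    return [xi + eps * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, g_x)]

def retrain(params, data, trainable, eps=0.0, lr=0.1, epochs=20):
    """Update only the parts named in `trainable`; eps > 0 trains on FGSM samples."""
    for _ in range(epochs):
        for x, y in data:
            x_in = fgsm(params, x, y, eps) if eps > 0 else x
            g_early, g_head, _ = gradients(params, x_in, y)
            if "early" in trainable:
                params["early"] = [[w - lr * g for w, g in zip(w_row, g_row)]
                                   for w_row, g_row in zip(params["early"], g_early)]
            if "head" in trainable:
                params["head"] = [v - lr * g for v, g in zip(params["head"], g_head)]
    return params

# Hypothetical toy data: two Gaussian blobs with labels 0 and 1.
rng = random.Random(0)
data = [([rng.gauss(c, 0.5), rng.gauss(c, 0.5)], 1 if c > 0 else 0)
        for c in (-1.0, 1.0) for _ in range(20)]
params = {"early": [[0.5, -0.3], [0.2, 0.4]], "head": [0.1, -0.2]}

# Retrain only the high-level part on adversarial samples; the early layer stays frozen.
retrain(params, data, trainable={"head"}, eps=0.1)
```

Passing `trainable={"early"}` instead would freeze the head and adapt only the low-level layer, which is the kind of comparison the abstract describes. The toy model is far too small to reproduce the reported findings; it only illustrates the mechanics of freezing one portion while retraining another.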