Deep learning has achieved success in many areas, drawing the attention of researchers and machine learning practitioners alike, with trained models deployed in a variety of settings. Alongside these achievements, research has shown that deep learning models are vulnerable to adversarial attacks. This finding opened a new direction of research, in which algorithms are developed to attack and defend vulnerable networks. Our interest is in understanding how these attacks alter the intermediate representations of deep learning models. We present a method for measuring and analyzing the deviations in representations induced by adversarial attacks, progressively across a selected set of layers. Experiments are conducted with an assortment of attack algorithms on the CIFAR-10 dataset, and plots are created to visualize the impact of adversarial attacks across different layers in a network.
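The core measurement the abstract describes is comparing a layer's activations on a clean input against its activations on an adversarially perturbed version of that input. The following is a minimal PyTorch sketch of that idea, not the paper's actual method: it assumes a toy CNN, a random stand-in batch in place of CIFAR-10, one-step FGSM as the attack, and mean per-sample L2 distance as the deviation measure; all names below (`SmallCNN`, `fgsm_attack`, `layer_deviations`) are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical small CNN standing in for the networks studied in the paper.
class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # -> 32 x 16 x 16
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # -> 64 x 8 x 8
        return self.fc(x.flatten(1))

def fgsm_attack(model, x, y, eps=8 / 255):
    """One-step FGSM: perturb x in the sign direction of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

def layer_deviations(model, layers, x_clean, x_adv):
    """Mean per-sample L2 distance between clean and adversarial activations."""
    acts = {}
    def hook(name):
        return lambda mod, inp, out: acts.setdefault(name, []).append(out.detach())
    handles = [m.register_forward_hook(hook(n)) for n, m in layers.items()]
    model(x_clean)  # first pass records clean activations
    model(x_adv)    # second pass records adversarial activations
    for h in handles:
        h.remove()
    return {n: (a[0] - a[1]).flatten(1).norm(dim=1).mean().item()
            for n, a in acts.items()}

model = SmallCNN().eval()
x = torch.rand(4, 3, 32, 32)           # stand-in for a CIFAR-10 batch
y = torch.randint(0, 10, (4,))
x_adv = fgsm_attack(model, x, y)
print(layer_deviations(model, {"conv1": model.conv1, "conv2": model.conv2}, x, x_adv))
```

Plotting the resulting per-layer values for several attacks would yield the kind of layer-wise deviation curves the abstract refers to; stronger attacks or deeper layers would typically show larger deviations.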