In the last decade, deep neural networks have proven to be very powerful in computer vision tasks, sparking a revolution in the computer vision and machine learning fields. However, deep neural networks are usually not robust to perturbations of the input data. In fact, several studies have shown that slightly changing the content of an image can cause a dramatic drop in the accuracy of the attacked neural network. Many methods for generating adversarial samples rely on gradients, which are usually not available to an attacker in real-world scenarios. In contrast to this class of attacks, another class of adversarial attacks, called black-box adversarial attacks, has emerged; these attacks do not use gradient information and are therefore better suited to real-world attack scenarios. In this work, we compare three well-known evolution strategies on the generation of black-box adversarial attacks for image classification tasks. While our results show that the attacked neural networks can, in most cases, be easily fooled by all the algorithms under comparison, they also show that some black-box optimization algorithms may perform better in "harder" setups, both in terms of attack success rate and efficiency (i.e., number of queries).
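To illustrate the black-box setting described above, the sketch below shows a minimal (1+1) evolution strategy that perturbs an image using only the class probabilities returned by the target model, with no access to gradients. This is an illustrative assumption-laden example, not the specific attacks or models evaluated in this work; in particular, `query_model` is a hypothetical stand-in for the attacked classifier's prediction API.

```python
import numpy as np

def query_model(image: np.ndarray) -> np.ndarray:
    # Hypothetical black-box oracle: in a real attack this would call the
    # target network and return its class probabilities for the query image.
    rng = np.random.default_rng(abs(hash(image.tobytes())) % (2**32))
    logits = rng.normal(size=10)
    return np.exp(logits) / np.exp(logits).sum()

def es_attack(image, true_label, sigma=0.05, eps=0.1, max_queries=1000):
    """(1+1)-ES sketch: keep a single perturbation and accept mutations that
    lower the probability of the true class, using only model queries."""
    best_delta = np.zeros_like(image)
    best_score = query_model(image)[true_label]
    queries = 1
    while queries < max_queries:
        # Mutate the current perturbation and keep it within the eps-ball.
        candidate = np.clip(best_delta + sigma * np.random.randn(*image.shape), -eps, eps)
        probs = query_model(np.clip(image + candidate, 0.0, 1.0))
        queries += 1
        if np.argmax(probs) != true_label:
            # The model no longer predicts the true class: attack succeeded.
            return np.clip(image + candidate, 0.0, 1.0), queries, True
        if probs[true_label] < best_score:
            # Greedy (1+1) selection: keep the better perturbation.
            best_delta, best_score = candidate, probs[true_label]
    return np.clip(image + best_delta, 0.0, 1.0), queries, False

adv, used, success = es_attack(np.random.rand(28, 28).astype(np.float32), true_label=3)
print(f"success: {success}, queries used: {used}")
```

The number of queries counted here corresponds to the efficiency measure mentioned above: stronger black-box optimizers fool the model with fewer calls to `query_model`.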