Solving for adversarial examples with projected gradient descent has been demonstrated to be highly effective in fooling neural-network-based classifiers. In the black-box setting, however, the attacker has only query access to the network, and solving for a successful adversarial example becomes much more difficult. To this end, recent methods aim to estimate the true gradient signal from input queries, but at the cost of excessive queries. We propose an efficient discrete surrogate to the optimization problem that does not require gradient estimation and is consequently free of the first-order update hyperparameters that would otherwise need tuning. Our experiments on CIFAR-10 and ImageNet show state-of-the-art black-box attack performance with a significant reduction in required queries compared to a number of recently proposed methods. The source code is available at https://github.com/snu-mllab/parsimonious-blackbox-attack.
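For context, the white-box baseline the abstract contrasts against can be sketched as follows: an L∞ projected gradient descent (PGD) attack that repeatedly ascends the loss and projects back into an ε-ball around the input. This is a minimal illustrative sketch on a toy logistic-regression classifier, not the paper's black-box method; the model, step size `alpha`, and budget `eps` are assumptions chosen for demonstration.

```python
import numpy as np

def pgd_attack(x, y, w, b, eps=0.3, alpha=0.05, steps=40):
    """L-infinity PGD against a toy logistic-regression classifier.

    x: input vector, y: label in {0, 1}, (w, b): model parameters.
    Ascends the cross-entropy loss, projecting back into the eps-ball.
    """
    x_adv = x.copy()
    for _ in range(steps):
        z = w @ x_adv + b
        p = 1.0 / (1.0 + np.exp(-z))              # sigmoid probability
        grad = (p - y) * w                        # d(cross-entropy)/dx
        x_adv = x_adv + alpha * np.sign(grad)     # signed ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project to the eps-ball
    return x_adv

rng = np.random.default_rng(0)
w, b = rng.normal(size=8), 0.0
x = rng.normal(size=8)
y = 1 if (w @ x + b) > 0 else 0                   # model's current prediction
x_adv = pgd_attack(x, float(y), w, b)
print(np.max(np.abs(x_adv - x)))                  # perturbation stays within eps
```

In the black-box setting the gradient `grad` above is unavailable, which is exactly why query-based methods resort to estimating it; the paper's contribution is a discrete surrogate that sidesteps this estimation entirely.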