Decision-based attacks (DBAs), in which attackers perturb inputs to fool learning algorithms while observing only the output labels, are a severe class of adversarial attacks against Deep Neural Networks (DNNs) because they require minimal knowledge on the attacker's part. State-of-the-art DBAs that rely on zeroth-order gradient estimation require an excessive number of queries. Recently, Bayesian optimization (BO) has shown promise in reducing the number of queries in score-based attacks (SBAs), in which attackers observe real-valued probability scores as outputs. However, extending BO to the DBA setting is nontrivial, because BO needs real-valued scores while a DBA attacker sees only output labels. In this paper, we close this gap by proposing an efficient DBA called BO-DBA. Unlike existing approaches, BO-DBA generates adversarial examples by searching over so-called \emph{directions of perturbations}. It then formulates the search as a BO problem that minimizes the real-valued distortion of the perturbation. With this optimized perturbation generation process, BO-DBA converges much faster than state-of-the-art DBA techniques. Experimental results on pre-trained ImageNet classifiers show that BO-DBA converges within 200 queries, while state-of-the-art DBA techniques need over 15,000 queries to reach the same level of perturbation distortion. BO-DBA also achieves attack success rates similar to those of BO-based SBAs, but with less distortion.
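To illustrate how a label-only setting can still yield a real-valued objective, the sketch below measures the distortion along one perturbation direction by binary search on the perturbation radius. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how distortion is computed, and the names `boundary_distance` and `predict_label`, the toy linear classifier, and the parameters `max_radius` and `tol` are all hypothetical. Plain random search stands in where BO-DBA would use Bayesian optimization to propose directions.

```python
import numpy as np

def boundary_distance(predict_label, x0, y0, theta, max_radius=10.0, tol=1e-3):
    """Real-valued distortion along direction theta, from hard labels only.

    Returns (within tol) the smallest radius r such that
    x0 + r * theta / ||theta|| is no longer classified as y0,
    or np.inf if the label is unchanged even at max_radius.
    Assumes the label flips at most once along the ray.
    """
    d = theta / np.linalg.norm(theta)
    if predict_label(x0 + max_radius * d) == y0:
        return np.inf
    lo, hi = 0.0, max_radius          # lo keeps label y0, hi is misclassified
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predict_label(x0 + mid * d) == y0:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical toy classifier that exposes only a hard label.
w = np.array([1.0, 0.0])
b = -1.0

def predict_label(x):
    return int(np.dot(w, x) + b > 0)

x0 = np.zeros(2)
y0 = predict_label(x0)

# Stand-in for the optimizer: random search over directions.
# BO-DBA would use Bayesian optimization to choose theta instead.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(50, 2))
distances = [boundary_distance(predict_label, x0, y0, t) for t in candidates]
best_theta = candidates[int(np.argmin(distances))]
```

Each call to `boundary_distance` costs one label query per binary-search step, so the function returns a scalar that an optimizer can minimize even though the classifier never exposes scores.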