Black-box adversarial attacks have attracted considerable attention for their practical relevance to deep learning security; they are also very challenging, since the attacker has no access to the network architecture or internal weights of the target model. Based on the hypothesis that an example which remains adversarial for multiple models is more likely to transfer its attack capability to other models, ensemble-based adversarial attack methods are efficient and widely used for black-box attacks. However, how the ensemble should be attacked has been little investigated, and existing ensemble attacks simply fuse the outputs of all models evenly. In this work, we treat the iterative ensemble attack as a stochastic gradient descent optimization process, in which the variance of the gradients across different models may lead to poor local optima. To this end, we propose a novel attack method called the stochastic variance reduced ensemble (SVRE) attack, which reduces the gradient variance across the ensemble models and takes full advantage of the ensemble attack. Empirical results on the standard ImageNet dataset demonstrate that the proposed method boosts adversarial transferability and significantly outperforms existing ensemble attacks.
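To make the variance-reduction idea concrete, the following PyTorch sketch shows an SVRG-style variance-reduced ensemble gradient step of the kind the abstract describes: the full-ensemble gradient at a snapshot point acts as a control variate that corrects the gradient of a single randomly chosen surrogate model. The function names, the `models` list, the inner-loop length `m_inner`, and the step size `beta` are illustrative assumptions, not the paper's exact implementation.

```python
import random
import torch

def ensemble_grad(x, y, models, loss_fn):
    """Average input-gradient of the loss over the given surrogate models."""
    x = x.clone().detach().requires_grad_(True)
    loss = sum(loss_fn(m(x), y) for m in models) / len(models)
    loss.backward()
    return x.grad.detach()

def svre_grad(x, y, models, loss_fn, m_inner=16, beta=0.05):
    """One outer step of a stochastic variance reduced ensemble gradient (sketch).

    Each inner step picks one model at random and corrects its gradient with the
    snapshot ensemble gradient, reducing the gradient variance across models.
    """
    g_full = ensemble_grad(x, y, models, loss_fn)              # snapshot gradient at x
    x_in = x.clone().detach()
    g_acc = torch.zeros_like(x)
    for _ in range(m_inner):
        k = random.randrange(len(models))
        g_k_in = ensemble_grad(x_in, y, [models[k]], loss_fn)  # model k at inner point
        g_k_out = ensemble_grad(x, y, [models[k]], loss_fn)    # model k at snapshot
        g_vr = g_k_in - (g_k_out - g_full)                     # variance-reduced gradient
        x_in = x_in + beta * g_vr.sign()                       # inner ascent step (untargeted)
        g_acc = g_acc + g_vr
    return g_acc / m_inner   # fed into the outer iterative attack update (e.g. with momentum)
```

In a typical use, this averaged variance-reduced gradient would replace the plain ensemble gradient inside an iterative attack loop, with the usual sign step and perturbation clipping applied outside this function.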