Face recognition (FR) models can be easily fooled by adversarial examples, which are crafted by adding imperceptible perturbations to benign face images. To improve the transferability of adversarial examples across FR models, we propose a novel attack method called Beneficial Perturbation Feature Augmentation Attack (BPFA), which reduces the overfitting of adversarial examples to the surrogate FR model through an adversarial strategy. Specifically, in each backpropagation step, BPFA records the gradients on pre-selected features and uses the gradient on the input image to craft the adversarial perturbation added to the input image. In the next forward propagation step, BPFA leverages the recorded gradients to add perturbations (i.e., beneficial perturbations) to the corresponding features that counteract the adversarial perturbation on the input image. These two steps are repeated until the last backpropagation step before the maximum number of iterations is reached. The optimization of the adversarial perturbation on the input image and the optimization of the beneficial perturbations on the features together constitute a minimax two-player game. Extensive experiments demonstrate that BPFA outperforms state-of-the-art gradient-based adversarial attacks on FR.
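To make the interplay of the two players concrete, the following is a minimal PyTorch-style sketch of such a min-max loop, assuming a surrogate FR model whose pre-selected feature layers are passed in as modules. The cosine impersonation loss, the sign-based updates, and the step sizes `alpha` and `rho` are illustrative assumptions, not the paper's exact settings.

```python
import torch

def bpfa_attack(model, feature_layers, x_src, emb_target,
                eps=8 / 255, alpha=1 / 255, rho=0.01, n_iter=10):
    """Sketch of a BPFA-style min-max loop (hypothetical helper).
    `feature_layers`: pre-selected modules of the surrogate FR model."""
    delta = torch.zeros_like(x_src, requires_grad=True)  # adversarial perturbation
    bene = {l: None for l in feature_layers}             # beneficial perturbations
    feats = {}

    def make_hook(layer):
        def hook(module, inputs, output):
            if bene[layer] is not None:        # forward step: inject the beneficial
                output = output + bene[layer]  # perturbation on this feature
            output.retain_grad()               # so its gradient can be recorded
            feats[layer] = output
            return output
        return hook

    handles = [l.register_forward_hook(make_hook(l)) for l in feature_layers]
    try:
        for _ in range(n_iter):
            emb = model(torch.clamp(x_src + delta, 0, 1))
            # Impersonation objective: the attacker MINIMIZES this loss.
            loss = 1 - torch.cosine_similarity(emb, emb_target).mean()
            model.zero_grad()
            delta.grad = None
            loss.backward()
            with torch.no_grad():
                # Min player: descend on the input perturbation.
                delta -= alpha * delta.grad.sign()
                delta.clamp_(-eps, eps)
                # Max player: beneficial perturbations ascend the same loss,
                # pitting the features against the input perturbation.
                for l in feature_layers:
                    bene[l] = rho * feats[l].grad.sign()
    finally:
        for h in handles:
            h.remove()
    return torch.clamp(x_src + delta, 0, 1).detach()
```

In this sketch, forward hooks inject the beneficial perturbations and `retain_grad()` records the feature gradients in the same pass, so each backpropagation step supplies both the input-side descent direction and the feature-side ascent direction of the two-player game.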