Face recognition (FR) models can be easily fooled by adversarial examples, which are crafted by adding imperceptible perturbations to benign face images. To improve the transferability of adversarial face examples, we propose a novel attack method called Beneficial Perturbation Feature Augmentation Attack (BPFA), which reduces the overfitting of adversarial examples to the surrogate FR model by constantly generating new models that have an effect similar to hard samples while the adversarial examples are being crafted. Specifically, during backpropagation, BPFA records the gradients on pre-selected features and uses the gradient on the input image to craft the adversarial example. During the next forward propagation, BPFA leverages the recorded gradients to add perturbations (i.e., beneficial perturbations) to the corresponding features that are pitted against the adversarial example. The optimization of the adversarial example and the optimization of the beneficial perturbations added to the features constitute a minimax two-player game. Extensive experiments demonstrate that BPFA can significantly boost the transferability of adversarial attacks on FR.
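To make the alternating forward/backward interplay concrete, below is a minimal PyTorch sketch of the loop the abstract describes. Everything named here is an illustrative assumption rather than the paper's exact setup: the `TinySurrogate` network, the choice of a single pre-selected feature layer, the impersonation-style cosine-similarity objective, and the step sizes `alpha`, `beta`, `eps`, and `steps` are all placeholders.

```python
# A minimal sketch of the BPFA idea, assuming a toy surrogate model and an
# impersonation objective; not the authors' exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySurrogate(nn.Module):
    """Toy stand-in for a surrogate FR model; the backbone output plays the
    role of one pre-selected feature layer."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(8 * 112 * 112, 128))

    def forward(self, x, feat_perturb=None):
        feats = self.backbone(x)
        if feat_perturb is not None:
            # Inject the beneficial perturbation recorded in the previous
            # backward pass into the pre-selected feature.
            feats = feats + feat_perturb
        feats.retain_grad()  # so the feature's gradient is kept on backward
        self._feats = feats
        return F.normalize(self.head(feats), dim=1)

def bpfa_attack(model, x, target_emb, steps=10, eps=8 / 255, alpha=1 / 255, beta=0.01):
    """One possible reading of the BPFA loop (impersonation setting)."""
    x_adv = x.clone().detach().requires_grad_(True)
    feat_perturb = None
    for _ in range(steps):
        emb = model(x_adv, feat_perturb)
        # Attack objective: pull the embedding toward the target identity.
        loss = 1 - F.cosine_similarity(emb, target_emb).mean()
        model.zero_grad()
        loss.backward()
        # Beneficial perturbation: step the feature in the direction that
        # *increases* the attack loss, i.e., against the adversarial example.
        feat_perturb = beta * model._feats.grad.detach().sign()
        with torch.no_grad():
            # Adversarial example: gradient descent on the attack loss,
            # kept within an L-infinity ball of radius eps around x.
            x_adv -= alpha * x_adv.grad.sign()
            x_adv.clamp_(x - eps, x + eps).clamp_(0, 1)
        x_adv.grad = None
    return x_adv.detach()

# Toy usage with random tensors standing in for a face and a target embedding.
model = TinySurrogate().eval()
x = torch.rand(1, 3, 112, 112)
target_emb = F.normalize(torch.randn(1, 128), dim=1)
x_adv = bpfa_attack(model, x, target_emb)
```

In this reading, the adversarial example descends on the attack loss while the beneficial feature perturbations ascend it, which is one way to realize the minimax two-player game the abstract describes.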