Deep learning-based face recognition models are vulnerable to adversarial attacks. To curb these attacks, most defense methods aim to improve the robustness of recognition models against adversarial perturbations. However, the generalization capacity of these methods is quite limited: in practice, they remain vulnerable to unseen adversarial attacks. By contrast, deep learning models are fairly robust to general perturbations, such as Gaussian noise. A straightforward approach is therefore to inactivate adversarial perturbations so that they can be handled as easily as general perturbations. In this paper, a plug-and-play adversarial defense method, named perturbation inactivation (PIN), is proposed to inactivate adversarial perturbations. We discover that perturbations in different subspaces have different influences on the recognition model. There should be a subspace, called the immune space, in which perturbations have fewer adverse impacts on the recognition model than in other subspaces. Hence, our method estimates the immune space and inactivates adversarial perturbations by restricting them to this subspace. Because it does not rely on a specific kind of adversarial attack, the proposed method can generalize to unseen adversarial perturbations. Exhaustive experiments show that this approach not only outperforms several state-of-the-art adversarial defense methods but also has superior generalization capacity. Moreover, the proposed method can be applied to four commercial APIs without additional training, indicating that it can be easily generalized to existing face recognition systems. The source code is available at https://github.com/RenMin1991/Perturbation-Inactivate
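The idea of restricting a perturbation to a subspace can be illustrated with a small geometric sketch. It is not the PIN implementation: the abstract does not say how the immune space is estimated, so the sketch assumes a linear subspace with a hand-picked basis, and the helper names (`orthonormalize`, `project`, `inactivate`) are hypothetical. Plain Python lists are used to keep it dependency-free.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def project(x, basis):
    """Orthogonal projection of x onto span(basis); basis must be orthonormal."""
    out = [0.0] * len(x)
    for b in basis:
        c = dot(x, b)
        out = [oi + c * bi for oi, bi in zip(out, b)]
    return out


def inactivate(x, mean, basis):
    """Restrict x to the affine subspace mean + span(basis).

    ASSUMPTION: the subspace is linear and its basis is given; this stands in
    for whatever estimate of the immune space the actual method produces.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [mi + pi for mi, pi in zip(mean, project(centered, basis))]


if __name__ == "__main__":
    # Toy 3-D "input space"; the assumed subspace is the x-y plane.
    basis = orthonormalize([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
    mean = [0.0, 0.0, 0.0]

    clean = [0.5, -0.2, 0.0]            # lies inside the subspace
    perturbation = [0.01, 0.0, 0.3]     # mostly outside the subspace
    adversarial = [c + p for c, p in zip(clean, perturbation)]

    purified = inactivate(adversarial, mean, basis)
    residual = [p - c for p, c in zip(purified, clean)]
    print("purified:", purified)
    print("residual perturbation:", residual)
```

In this toy case the component of the perturbation along the third axis is removed, while the small component inside the assumed subspace survives. Nothing in the projection depends on how the perturbation was generated, which mirrors the abstract's argument for generalization to unseen attacks.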