Adversarial patches are an important form of real-world adversarial attack and pose serious risks to the robustness of deep neural networks. Previous methods generate adversarial patches either by optimizing the perturbation values while fixing the pasting position, or by manipulating the position while fixing the patch's content. This suggests that both position and perturbation matter to the attack. In this paper, we therefore propose a novel method that simultaneously optimizes the position and the perturbation of an adversarial patch, and thus obtains a high attack success rate in the black-box setting. Technically, we treat the patch's position and the pre-designed hyper-parameters that determine the patch's perturbations as variables, and use a reinforcement learning framework to solve for both simultaneously, based on rewards obtained from the target model with a small number of queries. Extensive experiments are conducted on the face recognition (FR) task, and results on four representative FR models show that our method significantly improves the attack success rate and query efficiency. In addition, experiments on a commercial FR service and in physical environments confirm its practical value. We also extend our method to the traffic sign recognition task to verify its generalization ability.
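To make the joint optimization concrete, the following is a minimal sketch of the general idea only, not the method used in the paper. It assumes a hypothetical 5x5 grid of candidate paste positions, a hypothetical list of discrete perturbation-strength values, and a toy `query_target_model` function standing in for a real black-box query. The update rule is a plain REINFORCE-style policy gradient with a running-average baseline, chosen here for brevity; the abstract does not specify the actual policy, reward, or hyper-parameters.

```python
import numpy as np

GRID = 5                           # assumption: 5x5 grid of candidate paste positions
STRENGTHS = [0.1, 0.3, 0.5, 0.7]   # assumption: candidate perturbation hyper-parameter values


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def query_target_model(pos_idx, strength_idx):
    """Toy stand-in for one black-box query (hypothetical, not a real model).

    Returns a reward in (0, 1] that peaks at grid cell (1, 3) with strength 0.5.
    """
    row, col = divmod(pos_idx, GRID)
    dist = abs(row - 1) + abs(col - 3)
    pos_score = 1.0 - dist / (2 * (GRID - 1))
    strength_score = 1.0 - abs(STRENGTHS[strength_idx] - 0.5)
    return pos_score * strength_score


def expected_reward(p_pos, p_str):
    """Expected toy reward when position and strength are sampled independently."""
    return sum(
        p_pos[i] * p_str[j] * query_target_model(i, j)
        for i in range(len(p_pos))
        for j in range(len(p_str))
    )


def optimize(num_queries=2000, lr=0.3, seed=0):
    """Jointly learn a position policy and a strength policy from query rewards."""
    rng = np.random.default_rng(seed)
    pos_logits = np.zeros(GRID * GRID)
    str_logits = np.zeros(len(STRENGTHS))
    baseline = 0.0  # running average of rewards, used to reduce variance

    for _ in range(num_queries):
        p_pos = softmax(pos_logits)
        p_str = softmax(str_logits)

        # Sample one (position, strength) pair and spend one query on it.
        a_pos = rng.choice(len(p_pos), p=p_pos)
        a_str = rng.choice(len(p_str), p=p_str)
        reward = query_target_model(a_pos, a_str)

        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)

        # Gradient of log softmax probability w.r.t. logits: one_hot(action) - probs.
        grad_pos = -p_pos
        grad_pos[a_pos] += 1.0
        grad_str = -p_str
        grad_str[a_str] += 1.0

        # Both variables are updated from the same reward signal.
        pos_logits += lr * advantage * grad_pos
        str_logits += lr * advantage * grad_str

    return softmax(pos_logits), softmax(str_logits)


if __name__ == "__main__":
    p_pos, p_str = optimize()
    best_row, best_col = divmod(int(p_pos.argmax()), GRID)
    print("most likely position:", (best_row, best_col))
    print("most likely strength:", STRENGTHS[int(p_str.argmax())])
    print("expected reward:", expected_reward(p_pos, p_str))
```

In a real attack, the toy reward would be replaced by a score derived from the target model's output for the patched image, and the number of queries would be kept far smaller than in this illustration.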