Transfer-based adversarial attacks can effectively evaluate model robustness in the black-box setting. Though several methods have demonstrated impressive transferability of untargeted adversarial examples, targeted adversarial transferability remains challenging. The existing methods either have low targeted transferability or sacrifice computational efficiency. In this paper, we develop a simple yet practical framework to efficiently craft targeted transfer-based adversarial examples. Specifically, we propose a conditional generative attacking model, which can generate adversarial examples targeted at different classes by simply altering the class embedding while sharing a single backbone. Extensive experiments demonstrate that our method improves the success rates of targeted black-box attacks by a significant margin over the existing methods -- it reaches an average success rate of 29.6\% against six diverse models based on only one substitute white-box model in the standard testing of the NeurIPS 2017 competition, outperforming the state-of-the-art gradient-based attack methods (with an average success rate of $<$2\%) by a large margin. Moreover, the proposed method is more than an order of magnitude more efficient than gradient-based methods.
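The core idea of conditioning a single shared generator on a class embedding can be sketched as follows. This is a minimal toy illustration, not the paper's architecture: the "backbone" here is a single random weight matrix, the embedding table and all shapes are invented for demonstration, and in practice both would be learned networks trained against the substitute white-box model.

```python
import numpy as np

# Toy sketch of a conditional generative attack: one shared backbone
# produces a targeted perturbation for any class by swapping in that
# class's embedding. All names, shapes, and weights below are
# illustrative assumptions, not the method described in the paper.

rng = np.random.default_rng(0)

NUM_CLASSES, EMB_DIM, IMG_DIM = 10, 16, 64
EPS = 16 / 255  # L_inf perturbation budget

# In the real method these would be learned; here they are random stand-ins.
class_embeddings = rng.normal(size=(NUM_CLASSES, EMB_DIM))
W = rng.normal(scale=0.1, size=(IMG_DIM + EMB_DIM, IMG_DIM))  # shared "backbone"

def generate_adversarial(x, target_class):
    """Return x plus an L_inf-bounded perturbation conditioned on target_class."""
    z = np.concatenate([x, class_embeddings[target_class]])
    perturbation = EPS * np.tanh(z @ W)         # tanh bounds the perturbation to [-EPS, EPS]
    return np.clip(x + perturbation, 0.0, 1.0)  # stay in the valid image range

x = rng.uniform(size=IMG_DIM)  # stand-in for a flattened input image
adv_to_7 = generate_adversarial(x, target_class=7)
adv_to_3 = generate_adversarial(x, target_class=3)
```

Changing only the class embedding yields a different targeted perturbation, while every class reuses the same backbone weights; this is what makes the generative approach amortized and fast at attack time compared to per-example gradient-based optimization.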