Deep models have shown their vulnerability when processing adversarial samples. In the black-box setting, where the attacker has no access to the architecture or weights of the attacked model, training a substitute model for adversarial attacks has attracted wide attention. Previous substitute training approaches focus on stealing the knowledge of the target model from real training data or synthetic data, without exploring what kind of data can further improve the transferability between the substitute and target models. In this paper, we propose a novel perspective on substitute training that focuses on designing the distribution of the data used in the knowledge stealing process. More specifically, a diverse data generation module is proposed to synthesize large-scale data with a wide distribution, and an adversarial substitute training strategy is introduced to focus on data distributed near the decision boundary. The combination of these two modules further boosts the consistency between the substitute and target models, which greatly improves the effectiveness of the adversarial attack. Extensive experiments demonstrate the efficacy of our method against state-of-the-art competitors under both non-targeted and targeted attack settings. Detailed visualizations and analyses are also provided to help understand the advantages of our method.
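The pipeline described in the abstract can be sketched end to end: synthesize a wide distribution of queries, train a substitute to imitate the black-box target's responses, then craft a perturbation on the substitute and transfer it. This is a minimal illustration, not the paper's actual modules: the linear "target" model, the logistic-regression substitute, and the FGSM-style perturbation step are all simplified stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box target: the attacker may only query its outputs.
def target_predict(x):
    w_true = np.array([2.0, -1.0])  # secret decision boundary
    return (x @ w_true > 0).astype(int)

# 1) Diverse data generation (sketch): sample synthetic queries with a
#    wide distribution instead of relying on real training data.
X_syn = rng.normal(0.0, 3.0, size=(2000, 2))
y_syn = target_predict(X_syn)  # knowledge stealing via black-box queries

# 2) Train a substitute model (logistic regression via gradient descent)
#    to imitate the target's input-output behavior.
w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X_syn @ w + b)))
    w -= lr * (X_syn.T @ (p - y_syn)) / len(y_syn)
    b -= lr * np.mean(p - y_syn)

# 3) Craft an adversarial example on the *substitute* (single FGSM-style
#    step on the cross-entropy loss) and transfer it to the target.
x = np.array([1.0, 0.5])                      # clean input, target label 1
y = target_predict(x[None])[0]
p_x = 1.0 / (1.0 + np.exp(-(x @ w + b)))
grad_x = (p_x - y) * w                        # gradient of loss w.r.t. x
x_adv = x + 1.0 * np.sign(grad_x)             # epsilon = 1.0
# If the substitute aligns with the target's boundary, x_adv should
# flip the black-box target's decision despite never touching its weights.
```

The example shows why consistency between the two models matters: the transfer succeeds only because the substitute's learned boundary approximates the target's, which is exactly what the diverse generation and boundary-focused training aim to strengthen.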