Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack

Previous studies have verified that the functionality of black-box models can be stolen when full probability outputs are available. However, under the more practical hard-label setting, we observe that existing methods suffer from catastrophic performance degradation. We argue that this is because hard labels lack the rich information carried by probability predictions and because training on hard labels causes overfitting. To this end, we propose a novel hard-label model stealing method termed black-box dissector, which consists of two erasing-based modules. One is a CAM-driven erasing strategy designed to increase the information capacity hidden in hard labels from the victim model. The other is a random-erasing-based self-knowledge distillation module that uses soft labels from the substitute model to mitigate overfitting. Extensive experiments on four widely used datasets consistently demonstrate that our method outperforms state-of-the-art methods, with an improvement of up to 8.27%. We also validate the effectiveness and practical potential of our method on real-world APIs and against defense methods. Furthermore, our method benefits other downstream tasks, i.e., transfer adversarial attacks.
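
The two erasing-based modules can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the erased region is chosen or how the soft labels are combined, so the sketch assumes a square window, takes the attention map (`cam`) as given, erases the window with the highest summed attention, and averages the substitute model's probabilities over several randomly erased views. All function names, including the two-class `toy_substitute` model, are hypothetical.

```python
import math
import random


def _fill_window(image, top, left, size, fill):
    """Return a copy of `image` with a size x size window set to `fill`."""
    erased = [row[:] for row in image]
    for i in range(top, top + size):
        for j in range(left, left + size):
            erased[i][j] = fill
    return erased


def cam_erase(image, cam, size, fill=0.0):
    """Assumed CAM-driven erasing: blank the window with the largest summed attention."""
    h, w = len(image), len(image[0])
    best_score, best_pos = -math.inf, (0, 0)
    for top in range(h - size + 1):
        for left in range(w - size + 1):
            score = sum(cam[i][j]
                        for i in range(top, top + size)
                        for j in range(left, left + size))
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return _fill_window(image, best_pos[0], best_pos[1], size, fill), best_pos


def random_erase(image, size, rng, fill=0.0):
    """Blank one uniformly placed size x size window."""
    h, w = len(image), len(image[0])
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    return _fill_window(image, top, left, size, fill)


def soft_pseudo_label(image, substitute, n_views, size, rng):
    """Assumed self-distillation target: mean substitute output over erased views."""
    views = [substitute(random_erase(image, size, rng)) for _ in range(n_views)]
    return [sum(p[k] for p in views) / n_views for k in range(len(views[0]))]


def toy_substitute(image):
    """Hypothetical 2-class model: softmax over left-half and right-half pixel sums."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[half:]) for row in image)
    exps = [math.exp(left), math.exp(right)]
    return [e / sum(exps) for e in exps]


if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]
    cam = [[0.0] * 4 for _ in range(4)]
    cam[2][2] = cam[2][3] = cam[3][2] = cam[3][3] = 1.0  # attention in bottom-right corner

    erased, pos = cam_erase(image, cam, size=2)
    print(pos)  # (2, 2): the bottom-right window is erased
    print(soft_pseudo_label(image, toy_substitute, n_views=8, size=2,
                            rng=random.Random(0)))
```

In an attack loop, the image returned by `cam_erase` would be sent to the victim model for a fresh hard label, while `soft_pseudo_label` would supply a smoother training target for the substitute model. How the two signals enter the training loss is not described in the abstract and is left out here.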

