GRAPHITE: A Practical Framework for Generating Automatic Physical Adversarial Machine Learning Attacks

This paper investigates how easily an adversary can generate adversarial examples for real-world scenarios. We address three key requirements for practical attacks in the real world: 1) automatically constraining the size and shape of the attack so it can be applied with stickers, 2) transform-robustness, i.e., robustness of an attack to physical environmental variations such as viewpoint and lighting changes, and 3) supporting attacks in both white-box and black-box hard-label scenarios, so that the adversary can attack proprietary models. In particular, the art of automatically picking which areas to perturb remains largely unexplored; an efficient solution would remove the need to search over possible locations, shapes, and sizes as in current patch attacks. In this work, we propose GRAPHITE, an efficient and general framework for generating attacks that satisfy the above three key requirements. GRAPHITE takes advantage of transform-robustness, a metric based on expectation over transforms (EoT), to automatically generate small masks and optimize with gradient-free optimization. GRAPHITE is also flexible, as it can easily trade off transform-robustness, perturbation size, and query count in black-box settings. On a GTSRB model in a hard-label black-box setting, we are able to find attacks on all possible 1,806 victim-target class pairs with averages of 77.8% transform-robustness, perturbation size of 16.63% of the victim images, and 126K queries per pair. For digital-only attacks where achieving transform-robustness is not a requirement, GRAPHITE is able to find successful small-patch attacks with an average of only 566 queries for 92.2% of victim-target pairs. GRAPHITE is also able to find successful attacks using perturbations that modify small areas of the input image against PatchGuard, a recently proposed defense against patch-based attacks.
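To make the transform-robustness metric concrete, the sketch below estimates it by Monte Carlo sampling: the fraction of randomly transformed copies of a masked attack image that a hard-label classifier assigns to the target class. This is an illustrative assumption, not GRAPHITE's implementation. The names `random_transform`, `apply_masked_perturbation`, `transform_robustness`, and `toy_model` are hypothetical, and the brightness-plus-shift transform is a simple stand-in for real viewpoint and lighting changes.

```python
import numpy as np


def random_transform(image, rng):
    """Hypothetical stand-in for physical variation: brightness gain plus a small horizontal shift."""
    gain = rng.uniform(0.7, 1.3)
    shift = int(rng.integers(-2, 3))
    shifted = np.roll(image, shift, axis=1)
    return np.clip(shifted * gain, 0.0, 1.0)


def apply_masked_perturbation(victim, perturbation, mask):
    """Replace victim pixels with perturbation pixels wherever mask == 1."""
    return victim * (1.0 - mask) + perturbation * mask


def transform_robustness(predict_label, image, target, n_samples=100, seed=0):
    """Estimate the fraction of sampled transforms under which the model outputs `target`."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_samples):
        if predict_label(random_transform(image, rng)) == target:
            hits += 1
    return hits / n_samples


def toy_model(image):
    """Toy hard-label classifier (assumption): class 1 if the mean pixel exceeds 0.5, else class 0."""
    return int(image.mean() > 0.5)


victim = np.zeros((8, 8))            # classified as 0 by the toy model
perturbation = np.full((8, 8), 0.9)  # bright content pushes the toy model toward class 1
mask = np.zeros((8, 8))
mask[:, :6] = 1.0                    # perturb 75% of the image

attacked = apply_masked_perturbation(victim, perturbation, mask)
print("perturbation size:", mask.mean())
print("transform-robustness:", transform_robustness(toy_model, attacked, target=1))
```

Only the predicted label is used, so the same estimate works in the hard-label black-box setting. Each sample costs one model query, which is one way to see why transform-robustness, perturbation size, and query count trade off against one another.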

Related content

Black box

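In science, computing, and engineering, a black box is a device, system, or object that can be viewed in terms of its inputs and outputs (or transfer characteristics) without any knowledge of its internal workings. Its implementation is "opaque" (black). Almost anything can be referred to as a black box: a transistor, an engine, an algorithm, the human brain, an institution, or a government. To analyze something modeled as an open system with the typical "black box approach", only its stimulus/response behavior is considered, in order to infer the (unknown) box. The usual representation of such a black box system is a data flow diagram centered on the box. The opposite of a black box is a system whose internal components or logic are available for inspection, most commonly called a white box (sometimes also a "clear box" or "glass box").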
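In the machine learning setting, the hard-label black-box view can be sketched as a wrapper that exposes only the predicted label and counts queries. This is a minimal illustration under stated assumptions: `HardLabelOracle` and `hidden_model` are hypothetical names, not part of any library or of GRAPHITE.

```python
class HardLabelOracle:
    """Wraps a model so callers see only the predicted label, never scores or gradients."""

    def __init__(self, model_fn):
        self._model_fn = model_fn  # internal workings stay hidden from the caller
        self.queries = 0           # number of stimulus/response interactions so far

    def __call__(self, x):
        self.queries += 1
        return int(self._model_fn(x))


def hidden_model(x):
    """Hypothetical proprietary model: label 1 if the inputs sum to more than 1, else 0."""
    return sum(x) > 1


oracle = HardLabelOracle(hidden_model)
labels = [oracle([0.2, 0.3]), oracle([0.9, 0.4]), oracle([1.5, 0.0])]
print(labels, oracle.queries)
```

An attacker limited to this interface can only infer the model's behavior from input/output pairs, which is why query count becomes a cost to be managed.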
