A growing body of work has shown that deep neural networks are susceptible to adversarial examples: small perturbations applied to a model's input that lead to incorrect predictions. Unfortunately, most of the literature focuses on visually imperceptible perturbations applied to digital images, which are often, by design, impossible to deploy on physical targets. We present Adversarial Scratches, a novel L0 black-box attack that takes the form of scratches in images and offers far greater deployability than other state-of-the-art attacks. Adversarial Scratches leverage B\'ezier curves to reduce the dimensionality of the search space and, optionally, to constrain the attack to a specific image region. We test Adversarial Scratches in several scenarios, including a publicly available API and images of traffic signs. Results show that our attack often achieves a higher fooling rate than other deployable state-of-the-art methods, while requiring significantly fewer queries and modifying very few pixels.
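To illustrate why a B\'ezier parametrization shrinks the search space, the following minimal Python sketch rasterizes a "scratch" defined by only three control points and one colour, so the whole perturbation is described by a handful of scalars rather than one value per pixel. This is an assumption-laden illustration, not the authors' implementation: the quadratic degree, single-pixel thickness, colour handling, and sampling density are all hypothetical choices.

```python
import numpy as np

def bezier_points(control_points, n_samples=200):
    """Sample points along a quadratic Bezier curve given 3 control points."""
    p0, p1, p2 = [np.asarray(p, dtype=float) for p in control_points]
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    # Bernstein form of a quadratic Bezier curve
    return (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2

def draw_scratch(image, control_points, rgb):
    """Return a copy of `image` with the scratch drawn on it.

    The perturbation is parametrized by 3 control points (6 coordinates)
    plus an RGB colour: 9 scalars in total, regardless of image size.
    """
    perturbed = image.copy()
    h, w = image.shape[:2]
    for x, y in bezier_points(control_points):
        col = int(round(np.clip(x, 0, w - 1)))
        row = int(round(np.clip(y, 0, h - 1)))
        perturbed[row, col] = rgb
    return perturbed

# Hypothetical usage on a 224x224 RGB image: a black-box attack would search
# over the 9 scratch parameters instead of the 224*224*3 pixel values.
img = np.zeros((224, 224, 3), dtype=np.uint8)
adv = draw_scratch(img, [(10, 200), (120, 20), (210, 180)], rgb=(255, 0, 0))
```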