Constructing adversarial examples under a black-box threat model degrades the original images by introducing visual distortion. In this paper, we propose a novel black-box attack that directly minimizes the induced distortion by learning the noise distribution of the adversarial example, assuming only loss-oracle access to the black-box network. The quantified visual distortion, which measures the perceptual distance between the adversarial example and the original image, is introduced into our loss, while the gradient of the corresponding non-differentiable loss function is approximated by sampling noise from the learned noise distribution. We validate the effectiveness of our attack on ImageNet. Our attack results in much lower distortion than state-of-the-art black-box attacks and achieves a $100\%$ success rate on InceptionV3, ResNet50, and VGG16bn. The code is available at https://github.com/Alina-1997/visual-distortion-in-attack.
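Because only a loss oracle is available, the distortion-aware loss cannot be differentiated through the target network; a score-function (REINFORCE-style) estimator over the learned noise distribution is one way to realize the sampling-based gradient approximation described above. The following is a minimal sketch under stated assumptions, not the paper's actual implementation: it assumes a per-pixel Gaussian noise distribution parameterized by its mean, and `loss_oracle`, `distortion`, and the weight `lam` are illustrative placeholders.

```python
import numpy as np

# Hypothetical toy setup: x stands in for a clean image, loss_oracle for the
# only access we have to the black-box model (a scalar loss per query), and
# lam for the weight on the quantified visual distortion term.
rng = np.random.default_rng(0)
x = rng.random((3, 8, 8))             # stand-in for a clean image in [0, 1]

def loss_oracle(adv):
    # Placeholder black-box attack loss; a real attack queries the target model.
    return float(np.sum(adv))

def distortion(adv, clean):
    # Stand-in perceptual distance (plain L2 here, not the paper's metric).
    return float(np.linalg.norm(adv - clean))

def total_loss(noise, lam=0.1):
    # Combined objective: attack loss plus weighted visual distortion.
    adv = np.clip(x + noise, 0.0, 1.0)
    return loss_oracle(adv) + lam * distortion(adv, x)

def reinforce_grad(mu, sigma=0.1, n_samples=20):
    """Score-function estimate of d E_{delta ~ N(mu, sigma^2)}[L(delta)] / d mu.

    For delta = mu + sigma * eps, the score of the Gaussian w.r.t. mu is
    eps / sigma, so averaging L(delta) * eps / sigma over samples gives an
    unbiased gradient estimate without differentiating the loss itself.
    """
    g = np.zeros_like(mu)
    for _ in range(n_samples):
        eps = rng.standard_normal(mu.shape)
        g += total_loss(mu + sigma * eps) * eps / sigma
    return g / n_samples

# Plain gradient descent on the mean of the noise distribution.
mu = np.zeros_like(x)
for step in range(50):
    mu -= 0.01 * reinforce_grad(mu)
```

Each optimization step thus costs `n_samples` oracle queries, which is the usual trade-off in such sampling-based black-box attacks: more samples lower the estimator's variance but raise the query budget.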