While adversarial perturbation of images to attack deep image classification models poses serious security concerns in practice, this paper proposes a novel paradigm in which the concept of image perturbation can benefit classification performance, which we call the amicable aid. We show that by taking the opposite search direction to that of adversarial perturbation, an image can be modified to yield higher classification confidence, and even a misclassified image can be made to be correctly classified. This can also be achieved with a large amount of perturbation that renders the image unrecognizable to human eyes. The mechanism of the amicable aid is explained from the viewpoint of the underlying natural image manifold. Furthermore, we investigate the universal amicable aid, i.e., a fixed perturbation that can be applied to multiple images to improve their classification results. While it is challenging to find such perturbations, we show that making the decision boundary as perpendicular to the image manifold as possible, via training with modified data, is effective for obtaining a model for which universal amicable perturbations are more easily found.
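The core idea of taking the opposite search direction can be sketched as a reverse of the familiar FGSM-style update: instead of adding the sign of the loss gradient to the input, we subtract it, so the perturbation lowers the loss and raises the confidence in the true class. The following is a minimal illustration on a toy logistic classifier (the function names `amicable_fgsm` and `confidence`, and the logistic setting itself, are illustrative assumptions, not the paper's actual models or code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def confidence(x, w, b, y):
    """Probability the logistic model assigns to the true label y in {0, 1}."""
    p = sigmoid(w @ x + b)
    return p if y == 1 else 1.0 - p

def amicable_fgsm(x, w, b, y, eps=0.1):
    """One reverse-FGSM step: x' = x - eps * sign(dL/dx).

    For the cross-entropy loss of a logistic model, dL/dx = (p - y) * w,
    so subtracting the signed gradient moves x toward higher confidence
    in y -- the opposite of an adversarial attack step.
    """
    p = sigmoid(w @ x + b)
    grad = (p - y) * w          # gradient of the loss w.r.t. the input
    return x - eps * np.sign(grad)

# Toy example: a random linear classifier and a random "image".
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.0
x = rng.normal(size=8)
y = 1

before = confidence(x, w, b, y)
after = confidence(amicable_fgsm(x, w, b, y), w, b, y)
print(after > before)  # the aided input is classified more confidently
```

In a deep network the same update would use the input gradient obtained by backpropagation rather than this closed form, and iterating the step (as with iterative attacks) corresponds to the larger perturbations discussed above.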