Adversaries can add perturbations to an image to fool machine learning models into making incorrect predictions. One defense is to apply image preprocessing functions that remove the effects of the perturbation. Existing preprocessing defenses tend to be designed independently of the image content and can be defeated by adaptive attacks. We propose a novel image preprocessing technique called Essential Features that transforms the image into a robust feature space, preserving the main content of the image while significantly reducing the effects of perturbations. Specifically, it combines an adaptive blurring strategy that preserves the main edge features of the original object with a k-means color reduction step that simplifies the image to its k most representative colors. This significantly limits the attack surface by restricting an adversary's ability to adjust colors while preserving pertinent features of the original image. We additionally design several adaptive attacks and find that our approach remains more robust than previous baselines. We achieve 64% robustness on CIFAR-10 and 58.13% robustness on RESISC45, raising robustness by over 10% versus state-of-the-art adversarial training techniques against adaptive white-box and black-box attacks. These results suggest that strategies that retain essential features in images through content-adaptive processing hold promise as a complement to adversarial training for boosting robustness against adversarial inputs.
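The two preprocessing steps named above (edge-preserving blurring followed by k-means color reduction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the blurring rule, so the gradient-thresholded box blur below is a stand-in assumption, and the function names (`box_blur`, `edge_preserving_blur`, `kmeans_color_reduce`, `essential_features`) and parameters (`radius`, `edge_threshold`, `iters`, `seed`) are hypothetical. Images are assumed to be float arrays of shape (H, W, 3) with values in [0, 1].

```python
import numpy as np


def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1)^2 window, using edge padding."""
    size = 2 * radius + 1
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (size * size)


def edge_preserving_blur(img, radius=1, edge_threshold=0.1):
    """Stand-in for the adaptive blur (an assumption, not the paper's rule):
    blur flat regions, but leave pixels with a strong gradient untouched."""
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    edges = np.hypot(gx, gy) > edge_threshold  # boolean edge mask
    return np.where(edges[..., None], img, box_blur(img, radius))


def kmeans_color_reduce(img, k=4, iters=10, seed=0):
    """Replace every pixel by the nearest of k color centers (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    pixels = img.reshape(-1, 3).astype(np.float64)
    # Initialize centers from k distinct pixel positions.
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    return centers[labels].reshape(img.shape)


def essential_features(img, k=4):
    """Blur first, then quantize colors, so the output has at most k colors."""
    return kmeans_color_reduce(edge_preserving_blur(img), k=k)
```

Because every output pixel is one of at most k center colors, small per-pixel color changes in the input only matter if they move a pixel across a cluster boundary or shift a center, which is the intuition behind the reduced attack surface described above.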