Deep Neural Networks have been shown to be vulnerable to adversarial images. Conventional attacks strive for indistinguishable adversarial images with strictly restricted perturbations. Recently, researchers have moved to explore distinguishable yet non-suspicious adversarial images and demonstrated that color transformation attacks are effective. In this work, we propose Adversarial Color Filter (AdvCF), a novel color transformation attack that is optimized with gradient information in the parameter space of a simple color filter. In particular, our color filter space is explicitly specified, enabling a systematic analysis of model robustness against adversarial color transformations from both the attack and defense perspectives. In contrast, existing color transformation attacks do not offer the opportunity for such systematic analysis because they lack an explicit parameter space. We further conduct extensive comparisons between different color transformation attacks in terms of both success rate and image acceptability, based on a user study. Additional results provide interesting new insights into model robustness against AdvCF on three other visual tasks. We also highlight the human-interpretability of AdvCF, which is promising for practical use scenarios, and show its superiority over the state-of-the-art human-interpretable color transformation attack in terms of both image acceptability and efficiency.
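To illustrate the core idea of optimizing a filter's parameters rather than per-pixel noise, below is a minimal PyTorch sketch of a gradient-based attack over a simple differentiable color filter. This is an assumption-laden simplification, not the authors' exact filter: the hypothetical filter here is a per-channel piecewise-linear tone curve, and the sketch assumes a pretrained classifier `model`, an input batch `x` with values in [0, 1], and ground-truth labels `y`.

```python
# Minimal sketch of an AdvCF-style attack (illustrative, not the paper's exact filter).
import torch
import torch.nn.functional as F

def apply_color_filter(x, params, k=8):
    """Apply a per-channel piecewise-linear tone curve.

    x:      (B, 3, H, W) image batch with values in [0, 1]
    params: (3, k) unnormalized curve-segment slopes, one row per channel
    """
    # Softmax keeps segment increments positive and summing to 1, so each
    # channel's curve is monotone and maps [0, 1] onto [0, 1].
    incr = torch.softmax(params, dim=1)                                # (3, k)
    knots = torch.cumsum(incr, dim=1)                                  # curve values at i/k
    knots = torch.cat([torch.zeros(3, 1, device=params.device), knots], dim=1)

    t = x.clamp(0, 1) * k
    idx = t.floor().long().clamp(max=k - 1)                            # segment index per pixel
    frac = t - idx.float()                                             # position within segment

    # Look up the curve value at both ends of each pixel's segment.
    table = knots.view(1, 3, 1, 1, k + 1).expand(*x.shape, k + 1)
    lo = torch.gather(table, 4, idx.unsqueeze(-1)).squeeze(-1)
    hi = torch.gather(table, 4, (idx + 1).unsqueeze(-1)).squeeze(-1)
    return lo + frac * (hi - lo)                                       # linear interpolation

def advcf_style_attack(model, x, y, steps=50, lr=0.1):
    # Zero parameters give equal slopes, i.e. an identity curve to start from.
    params = torch.zeros(3, 8, device=x.device, requires_grad=True)
    opt = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        x_adv = apply_color_filter(x, params)
        loss = -F.cross_entropy(model(x_adv), y)   # maximize classification loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return apply_color_filter(x, params).detach()
```

Because the search space is a handful of curve parameters instead of unconstrained per-pixel noise, the resulting transformation can be visually large yet remain a natural-looking, human-interpretable color adjustment, which is the property the abstract emphasizes.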