Segmentation models have been shown to be vulnerable to both targeted and non-targeted adversarial attacks. However, the resulting heavily damaged predictions make such attacks easy to detect. In this paper, we propose semantically stealthy adversarial attacks that manipulate targeted labels as designed while preserving non-targeted labels, thereby hiding the corresponding attack behavior. One challenge is producing semantically meaningful manipulations that hold across datasets/models. Another challenge is avoiding damage to non-targeted labels. To address these challenges, we treat each input image as prior knowledge for generating perturbations, and we design a dedicated regularizer to aid feature extraction. To evaluate our model's performance, we design three basic attack types, namely `vanishing into the context', `embedding fake labels', and `displacing target objects'. Experiments show that our stealthy adversarial model attacks segmentation models with a relatively high success rate on Cityscapes, Mapillary, and BDD100K. Finally, our framework also empirically shows good generalization across datasets/models.
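As an illustrative sketch only (not the paper's exact formulation), a stealthy objective of this kind can be written as a perturbation search that enforces the attacker-specified labels on a targeted region while penalizing any change to predictions elsewhere; here $\Omega$ denotes the targeted pixel region, $y^{\ast}$ the designed target labels, and $\lambda$, $\epsilon$ are hypothetical trade-off and budget parameters:
\[
\min_{\delta}\;
\mathcal{L}_{\mathrm{attack}}\!\left(f(x+\delta)\big|_{\Omega},\, y^{\ast}\right)
\;+\;
\lambda\,\mathcal{L}_{\mathrm{preserve}}\!\left(f(x+\delta)\big|_{\bar{\Omega}},\, f(x)\big|_{\bar{\Omega}}\right)
\quad \text{s.t.}\quad \|\delta\|_{\infty} \le \epsilon ,
\]
where the first term drives the targeted manipulation (e.g., vanishing, fake-label embedding, or displacement) and the second term keeps non-targeted predictions close to the clean output $f(x)$, which is what makes the attack semantically stealthy.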