Segmentation models have been found to be vulnerable to both targeted and non-targeted adversarial attacks. However, the resulting segmentation outputs are often so damaged that the attack is easy to detect. In this paper, we propose semantically stealthy adversarial attacks that manipulate targeted labels while preserving non-targeted labels. One challenge is making semantically meaningful manipulations across datasets and models; another is avoiding damage to the non-targeted labels. To address these challenges, we treat each input image as prior knowledge for generating perturbations, and we design a special regularizer to help extract features. To evaluate our model's performance, we design three basic attack types, namely `vanishing into the context,' `embedding fake labels,' and `displacing target objects.' Our experiments show that our stealthy adversarial model can attack segmentation models with a relatively high success rate on Cityscapes, Mapillary, and BDD100K. Our framework also shows good empirical generalization across datasets and models.
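The dual objective described above can be illustrated with a minimal sketch: a PGD-style attack whose loss pushes pixels inside an attacked region toward a chosen target label while penalizing any change to the clean predictions elsewhere. This is not the paper's method (which generates perturbations from the input image and uses a dedicated regularizer for feature extraction); the function `stealthy_attack`, its arguments, and the step sizes below are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def stealthy_attack(model, image, target_mask, target_label, clean_pred,
                    eps=8 / 255, alpha=2 / 255, steps=40):
    """Hypothetical PGD-style sketch of a dual-objective segmentation attack.

    image:        (1, 3, H, W) input in [0, 1]
    target_mask:  (1, H, W) float mask, 1 where the attack should act
    target_label: class index to force inside the mask
    clean_pred:   (1, H, W) long tensor of the model's clean predictions
    """
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        logits = model(image + delta)                       # (1, C, H, W)
        # Targeted term: force the chosen label inside the attacked region.
        forced = torch.full_like(clean_pred, target_label)
        loss_target = F.cross_entropy(logits, forced, reduction='none')
        # Preservation term: keep the clean labels outside the region.
        loss_keep = F.cross_entropy(logits, clean_pred, reduction='none')
        loss = (loss_target * target_mask
                + loss_keep * (1.0 - target_mask)).mean()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()              # minimize both terms
            delta.clamp_(-eps, eps)                         # L_inf budget
            delta.add_(image).clamp_(0.0, 1.0).sub_(image)  # keep image + delta valid
        delta.grad.zero_()
    return (image + delta).detach()
```

In this sketch the preservation term plays the role of the "stealthiness" constraint: without it, a plain targeted attack would also corrupt labels outside the attacked region, which is exactly the easy-to-spot damage the abstract argues against.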