Adversarial examples can be used to maliciously and covertly change a model's prediction. It is known that an adversarial example designed for one model can transfer to other models as well. This poses a major threat because it means that attackers can target systems in a black-box manner. In the domain of transferability, researchers have proposed ways to make attacks more transferable and to make models more robust to transferred examples. However, to the best of our knowledge, no prior work proposes a means for ranking the transferability of adversarial examples from the perspective of a black-box attacker. This is an important task because an attacker is likely to use only a select set of examples and will therefore want to choose the samples that are most likely to transfer. In this paper, we propose a method for ranking the transferability of adversarial examples without access to the victim's model. To accomplish this, we define and estimate the expected transferability of a sample given limited information about the victim. We also explore two practical scenarios: one where the adversary can select the best sample to attack with, and one where the adversary must use a specific sample but can choose among different perturbations. Through our experiments, we found that our ranking method can increase an attacker's success rate by up to 80% compared to the baseline (random selection without ranking).
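A minimal sketch of the idea of ranking candidates by estimated transferability, assuming the estimate is obtained by querying an ensemble of locally available surrogate models (the function names, the voting estimator, and the toy surrogates below are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np


def expected_transferability(x_adv, true_label, surrogates):
    """Hypothetical estimator: score an adversarial example by the fraction
    of surrogate models whose prediction it flips away from the true label."""
    fooled = [int(model(x_adv) != true_label) for model in surrogates]
    return float(np.mean(fooled))


def rank_candidates(candidates, surrogates):
    """Rank (x_adv, true_label) candidates from most to least likely to
    transfer, so an attacker can submit only the top-ranked samples."""
    scores = [expected_transferability(x, y, surrogates) for x, y in candidates]
    order = np.argsort(scores)[::-1]
    return [(candidates[i], scores[i]) for i in order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy surrogate "models": random linear classifiers over 10-D inputs
    # (stand-ins for models the attacker trains or downloads locally).
    weights = [rng.normal(size=10) for _ in range(5)]
    surrogates = [lambda x, w=w: int(x @ w > 0) for w in weights]
    # Candidate adversarial examples, all crafted for true label 1.
    candidates = [(rng.normal(size=10), 1) for _ in range(20)]
    ranked = rank_candidates(candidates, surrogates)
    print("best estimated transferability:", ranked[0][1])
```

The same scoring function applies to the second scenario from the abstract: when the attacker must use a specific sample, the candidates would be alternative perturbations of that one sample rather than different samples.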