Most current classifiers are vulnerable to adversarial examples, small input perturbations that change the classification output. Many existing attack algorithms cover various settings, from white-box to black-box classifiers, but typically assume that the answers are deterministic and often fail when they are not. We therefore propose a new adversarial decision-based attack specifically designed for classifiers with probabilistic outputs. It is based on the HopSkipJump attack by Chen et al. (2019, arXiv:1904.02144v5), a strong and query-efficient decision-based attack originally designed for deterministic classifiers. Our P(robabilisticH)opSkipJump attack adapts its number of queries to maintain HopSkipJump's original output quality across various noise levels, while converging to its query efficiency as the noise level decreases. We test our attack on various noise models, including state-of-the-art off-the-shelf randomized defenses, and show that they offer almost no extra robustness to decision-based attacks. Code is available at https://github.com/cjsg/PopSkipJump.