We evaluate the robustness of Adversarial Logit Pairing (ALP), a recently proposed defense against adversarial examples. We find that a network trained with ALP achieves 0.6% accuracy under the threat model in which the defense is considered. We give a brief overview of the defense and of the threat models and claims considered, and then discuss the methodology and results of our attack, which may offer insight into why ALP is vulnerable to adversarial attack.