As a research community, we still lack a systematic understanding of the progress on adversarial robustness, which often makes it hard to identify the most promising ideas for training robust models. A key challenge in benchmarking robustness is that its evaluation is often error-prone, leading to overestimation of robustness. Our goal is to establish a standardized benchmark of adversarial robustness that reflects the robustness of the considered models as accurately as possible within a reasonable computational budget. To this end, we start by considering the image classification task and introduce restrictions (possibly loosened in the future) on the allowed models. We evaluate adversarial robustness with AutoAttack, an ensemble of white- and black-box attacks, which was recently shown in a large-scale study to improve almost all robustness evaluations compared to the original publications. To prevent overadaptation of new defenses to AutoAttack, we welcome external evaluations based on adaptive attacks, especially where AutoAttack flags a potential overestimation of robustness. Our leaderboard, hosted at https://robustbench.github.io/, contains evaluations of 120+ models and aims to reflect the current state of the art in image classification on a set of well-defined tasks in $\ell_\infty$- and $\ell_2$-threat models and on common corruptions, with possible extensions in the future. Additionally, we open-source the library https://github.com/RobustBench/robustbench, which provides unified access to 80+ robust models to facilitate their downstream applications. Finally, based on the collected models, we analyze the impact of robustness on performance under distribution shifts, calibration, out-of-distribution detection, fairness, privacy leakage, smoothness, and transferability.