Great advances in deep neural networks (DNNs) have led to state-of-the-art performance on a wide range of tasks. However, recent studies have shown that DNNs are vulnerable to adversarial attacks, which raises serious concerns when deploying these models in safety-critical applications such as autonomous driving. Various defense approaches have been proposed against adversarial attacks, including: a) empirical defenses, which can usually be broken again by adaptive attacks and provide no robustness certification; and b) certifiably robust approaches, which consist of robustness verification, providing a lower bound on robust accuracy against any attack under certain conditions, and corresponding robust training approaches. In this paper, we systematize certifiably robust approaches together with their practical and theoretical implications and findings. We also provide the first comprehensive benchmark of existing robustness verification and training approaches across different datasets. In particular, we 1) provide a taxonomy of the robustness verification and training approaches and summarize the methodologies of representative algorithms, 2) reveal the characteristics, strengths, limitations, and fundamental connections among these approaches, 3) discuss current research progress, theoretical barriers, main challenges, and future directions for certifiably robust approaches for DNNs, and 4) provide an open-source unified platform to evaluate 20+ representative certifiably robust approaches.
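To make concrete what "robustness verification providing a lower bound" means, the following is a minimal sketch (our own illustration, not an algorithm from any specific paper surveyed here) using interval bound propagation on a toy linear classifier: it either certifies that every input within an L-infinity ball of radius eps keeps the same predicted label, or declines to certify. The network, weights, and radius are all hypothetical.

```python
import numpy as np

def interval_bound_linear(W, b, x_lo, x_hi):
    """Propagate the box [x_lo, x_hi] through y = W x + b.
    Standard interval arithmetic: positive weights take the matching
    bound, negative weights take the opposite one."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    y_lo = W_pos @ x_lo + W_neg @ x_hi + b
    y_hi = W_pos @ x_hi + W_neg @ x_lo + b
    return y_lo, y_hi

def certify(W, b, x, eps, label):
    """Return True if every input within L_inf radius eps of x is
    classified as `label` by the one-layer network y = W x + b.
    The check is sound but incomplete: True is a certificate,
    while False only means the bound was too loose to decide."""
    y_lo, y_hi = interval_bound_linear(W, b, x - eps, x + eps)
    # Certified iff the worst-case logit of the true class still
    # exceeds the best-case logit of every other class.
    others = np.delete(y_hi, label)
    return bool(y_lo[label] > others.max())

# Toy example: 2 classes, 2 input features (hypothetical values).
W = np.array([[2.0, -1.0], [0.5, 1.0]])
b = np.zeros(2)
x = np.array([1.0, 0.2])
print(certify(W, b, x, eps=0.05, label=0))  # True  -> certified at radius 0.05
print(certify(W, b, x, eps=0.50, label=0))  # False -> cannot certify at 0.5
```

Averaging such per-example certificates over a test set yields a certified (lower-bound) robust accuracy that holds against any attack within the perturbation ball, which is the quantity the verification approaches surveyed in this paper bound with progressively tighter and more scalable techniques.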