While the literature shows that simultaneously accurate and robust classifiers exist for common datasets, previous methods that improve the adversarial robustness of classifiers often exhibit an accuracy-robustness trade-off. We build upon recent advancements in data-driven ``locally biased smoothing'' to develop classifiers that treat benign and adversarial test data differently. Specifically, we tailor the smoothing operation to the use of a robust neural network as the source of robustness. We then extend the smoothing procedure to the multi-class setting and adapt an adversarial input detector into a policy network. The policy network adaptively adjusts the mixture of the robust base classifier and a standard network, where the standard network is optimized for clean accuracy and is not robust in general. We provide theoretical analyses to motivate the use of the adaptive smoothing procedure, certify the robustness of the smoothed classifier under realistic assumptions, and justify the introduction of the policy network. We use various attack methods, including AutoAttack and adaptive attacks, to empirically verify that the smoothed model noticeably improves the accuracy-robustness trade-off. On the CIFAR-100 dataset, our method simultaneously achieves an 80.09\% clean accuracy and a 32.94\% AutoAttacked accuracy. The code that implements adaptive smoothing is available at https://github.com/Bai-YT/AdaptiveSmoothing.
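The adaptive mixture described above can be summarized as a convex combination of the two networks' outputs, weighted by the policy network. The notation below is illustrative rather than taken verbatim from the paper: let $g(x)$ denote the standard (accurate) classifier's logits, $h(x)$ the robust base classifier's logits, and $\alpha(x) \in [0, 1]$ the mixing weight produced by the policy network for input $x$:

\[
    f_{\mathrm{mix}}(x) \;=\; \bigl(1 - \alpha(x)\bigr)\, g(x) \;+\; \alpha(x)\, h(x).
\]

Intuitively, the policy network drives $\alpha(x)$ toward $0$ on benign inputs, recovering the standard network's clean accuracy, and toward $1$ on inputs it deems adversarial, deferring to the robust classifier.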