The trade-off between robustness and accuracy has been widely studied in the adversarial literature. Although still debated, the prevailing view is that this trade-off is inherent, whether empirically or theoretically. We therefore trace the origin of this trade-off in adversarial training and find that it may stem from an improperly defined robust error, which imposes an inductive bias of local invariance -- an overcorrection towards smoothness. Given this, we advocate employing local equivariance to describe the ideal behavior of a robust model, leading to a self-consistent robust error named SCORE. By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling worst-case uncertainty via robust optimization. By simply substituting the KL divergence with variants of distance metrics, SCORE can be minimized efficiently. Empirically, our models achieve top-rank performance on RobustBench under AutoAttack. Moreover, SCORE provides instructive insights for explaining the overfitting phenomenon and the semantic input gradients observed on robust models. Code is available at https://github.com/P2333/SCORE.
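To make the substitution described above concrete, the sketch below contrasts a TRADES-style robust loss, whose consistency term is a KL divergence, with a variant in which that term is replaced by a distance metric between predicted distributions. This is a minimal illustration only, not the paper's reference implementation: the function names (trades_like_loss, score_like_loss), the squared-L2 metric, and the choice of what the distance is measured between are illustrative assumptions; the exact SCORE objective follows the paper and the released code.

```python
# Minimal sketch: swapping the KL consistency term in a TRADES-style loss
# for a distance metric, as the abstract describes. Illustrative only.
import torch
import torch.nn.functional as F


def trades_like_loss(logits_clean, logits_adv, targets, beta=6.0):
    """Cross-entropy on clean inputs plus a KL smoothness term (TRADES-style)."""
    ce = F.cross_entropy(logits_clean, targets)
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_clean, dim=1),
                  reduction="batchmean")
    return ce + beta * kl


def score_like_loss(logits_clean, logits_adv, targets, beta=6.0):
    """Same structure, but the KL term is replaced by a distance metric
    (here, an assumed squared L2 distance between softmax outputs)."""
    ce = F.cross_entropy(logits_clean, targets)
    dist = (F.softmax(logits_adv, dim=1) -
            F.softmax(logits_clean, dim=1)).pow(2).sum(dim=1).mean()
    return ce + beta * dist


if __name__ == "__main__":
    # Toy usage with random logits for a 10-class problem.
    torch.manual_seed(0)
    logits_clean = torch.randn(8, 10)
    logits_adv = logits_clean + 0.1 * torch.randn(8, 10)
    targets = torch.randint(0, 10, (8,))
    print("TRADES-like loss:", trades_like_loss(logits_clean, logits_adv, targets).item())
    print("SCORE-like loss: ", score_like_loss(logits_clean, logits_adv, targets).item())
```

In practice the adversarial logits would come from an inner maximization (e.g., a PGD attack) rather than the random perturbation used in this toy example, which is what the abstract means by still handling worst-case uncertainty via robust optimization.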