The trade-off between robustness and accuracy has been widely studied in the adversarial literature. Although still debated, the prevailing view is that this trade-off is inherent, either empirically or theoretically. We therefore trace the origin of this trade-off in adversarial training and find that it may stem from an improperly defined robust error, which imposes an inductive bias of local invariance -- an overcorrection towards smoothness. Given this, we advocate employing local equivariance to describe the ideal behavior of a robust model, leading to a self-consistent robust error named SCORE. By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty via robust optimization. By simply substituting the KL divergence with variants of distance metrics, SCORE can be efficiently minimized. Empirically, our models achieve top-rank performance on RobustBench under AutoAttack. Moreover, SCORE provides instructive insights for explaining the overfitting phenomenon and the semantic input gradients observed on robust models.
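The substitution mentioned above -- replacing the KL divergence with a proper distance metric -- can be illustrated with a toy sketch. This is a minimal, hedged illustration using numpy only; the function names and the squared-L2 choice are assumptions for exposition, not the paper's exact formulation:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between probability vectors: the regularizer used in
    # TRADES-style adversarial training (assumed baseline here).
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def l2_distance(p, q):
    # One illustrative distance-metric substitute (the abstract only
    # says "variants of distance metrics"; squared L2 is an assumption).
    return float(np.sum((p - q) ** 2))

# Toy clean vs. adversarial predictive distributions.
p_clean = np.array([0.7, 0.2, 0.1])
p_adv   = np.array([0.5, 0.3, 0.2])

# KL is asymmetric, while a distance metric is symmetric and vanishes
# exactly when the two predictions agree -- the "self-consistent"
# behavior the abstract emphasizes.
kl_forward  = kl_divergence(p_clean, p_adv)
kl_backward = kl_divergence(p_adv, p_clean)
dist        = l2_distance(p_clean, p_adv)
```

The point of the sketch is purely structural: a metric's symmetry and identity-of-indiscernibles properties are what make the resulting robust error self-consistent, whereas the asymmetric KL term overpenalizes disagreement in one direction.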