Recent research in robust optimization has shown an overfitting-like phenomenon in which models trained against adversarial attacks exhibit higher robustness on the training set than on the test set. Although previous work provided theoretical explanations for this phenomenon using a robust PAC-Bayesian bound over the adversarial test error, related algorithmic derivations are at best only loosely connected to this bound, which implies that there is still a gap between their empirical success and our understanding of adversarial robustness theory. To close this gap, in this paper we consider a different form of the robust PAC-Bayesian bound and directly minimize it with respect to the model posterior. The derivation of the optimal solution connects PAC-Bayesian learning to the geometry of the robust loss surface through a Trace of Hessian (TrH) regularizer that measures the surface flatness. In practice, we restrict the TrH regularizer to the top layer only, which results in an analytical solution to the bound whose computational cost does not depend on the network depth. Finally, we evaluate our TrH regularization approach over CIFAR-10/100 and ImageNet using Vision Transformers (ViT) and compare against baseline adversarial robustness algorithms. Experimental results show that TrH regularization leads to improved ViT robustness that either matches or surpasses previous state-of-the-art approaches while at the same time requiring less memory and computational cost.
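As a rough illustration of the kind of quantity the abstract refers to (not the paper's exact closed-form objective), a Trace-of-Hessian penalty over a chosen set of parameters, such as the top layer only, can be estimated with a standard Hutchinson-style trace estimator in PyTorch. The function name `trace_of_hessian` and the toy usage below are illustrative assumptions:

```python
import torch

def trace_of_hessian(loss, params, n_samples=1):
    """Hutchinson estimator of Tr(H), where H is the Hessian of
    `loss` w.r.t. the tensors in `params` (e.g. top-layer weights).
    Tr(H) = E_v[v^T H v] for Rademacher vectors v."""
    # First-order gradients with a graph, so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    trace_est = 0.0
    for _ in range(n_samples):
        # Rademacher probe vectors: entries are +1 or -1.
        vs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        # Scalar g^T v, then one more backward pass gives the
        # Hessian-vector product Hv without forming H explicitly.
        gv = sum((g * v).sum() for g, v in zip(grads, vs))
        hvs = torch.autograd.grad(gv, params, retain_graph=True)
        trace_est = trace_est + sum((hv * v).sum() for hv, v in zip(hvs, vs))
    return trace_est / n_samples

# Toy check on a diagonal quadratic, where the estimator is exact:
# loss = sum_i a_i * w_i^2 has Hessian diag(2*a_i), so Tr(H) = 2*sum(a).
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
a = torch.tensor([0.5, 1.0, 1.5])
loss = (a * w ** 2).sum()
trh = trace_of_hessian(loss, [w])  # -> 6.0
```

Restricting `params` to the final classification layer keeps the cost of the extra backward pass independent of network depth, which is the practical motivation the abstract describes; the paper's analytical solution avoids even this stochastic estimation.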
翻译:最近对强力优化的研究显示,与测试组相比,针对对抗性攻击所培训的模式在培训中表现出的强力强度比测试组更为强。虽然以前的工作为这一现象提供了理论解释,但利用强大的PAC-Bayesian在对抗性测试错误上捆绑,相关的算法衍生最多只能与这一约束有松散的联系,这意味着其经验成功与我们对对抗性强力理论的理解之间仍然存在着差距。为了缩小这一差距,我们认为,与测试组相比,强力PAC-Bayesian受约束的模型在培训中表现得更为有力。尽管最佳解决方案的衍生将PAC-Bayesian学习与强力损失表面的几何学相挂钩,但通过测量海珊(TrH)常规化,衡量表面平板化程度的轨迹。在实践中,我们将TrH正规化器仅局限于顶层,从而得出一种分析解决方案,其计算成本并不取决于网络深度。最后,我们评估了我们对CFAR-10-100和图像网的正规化方法,而利用高压性水平的图像网络,同时对比前的硬性水平,对比和高压性机率,显示前高压的硬性变压的升级的升级的升级,要求比标值。