Several existing works study either adversarial or natural distributional robustness of deep neural networks, but separately. In practice, however, models must exhibit both types of robustness to be reliable. In this work, we bridge this gap and show that, in fact, explicit tradeoffs exist between adversarial and natural distributional robustness. We first consider a simple linear regression setting on Gaussian data with disjoint sets of core and spurious features. In this setting, through theoretical and empirical analysis, we show that (i) adversarial training with $\ell_1$ and $\ell_2$ norms increases the model's reliance on spurious features; (ii) for $\ell_\infty$ adversarial training, spurious reliance arises only when the scale of the spurious features is larger than that of the core features; and (iii) adversarial training can have the unintended consequence of reducing distributional robustness, specifically when spurious correlations change in the new test domain. Next, we present extensive empirical evidence, using a test suite of twenty adversarially trained models evaluated on five benchmark datasets (ObjectNet, RIVAL10, Salient ImageNet-1M, ImageNet-9, Waterbirds), that adversarially trained classifiers rely on backgrounds more than their standardly trained counterparts, validating our theoretical results. We also show that spurious correlations in the training data, when preserved in the test domain, can improve adversarial robustness, revealing that previous claims attributing adversarial vulnerability to spurious correlations are incomplete.
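Finding (i) can be illustrated with a minimal toy sketch of the linear-regression setting described above. All choices below (sample size, noise scales, perturbation radius, the numerical-gradient optimizer) are illustrative assumptions, not taken from the paper; the sketch uses the standard closed-form $\ell_2$-adversarial squared loss for a linear model, $\max_{\|\delta\|_2 \le \epsilon} (w^\top(x+\delta) - y)^2 = (|w^\top x - y| + \epsilon \|w\|_2)^2$.

```python
import numpy as np

# Hypothetical toy instance of the paper's setting: one core feature that
# drives the label and one spurious feature correlated with it in training.
rng = np.random.default_rng(0)
n = 2000
core = rng.normal(size=n)                   # core feature
spur = core + 0.3 * rng.normal(size=n)      # spuriously correlated feature
X = np.column_stack([core, spur])
y = core + 0.1 * rng.normal(size=n)         # label depends only on core

def robust_loss(w, eps):
    # Closed-form l2-adversarial squared loss for a linear model:
    # max_{||d||_2 <= eps} (w.(x+d) - y)^2 = (|w.x - y| + eps*||w||_2)^2
    r = np.abs(X @ w - y) + eps * np.linalg.norm(w)
    return np.mean(r ** 2)

def fit(eps, lr=0.05, steps=2000):
    # plain gradient descent; a central-difference gradient keeps the
    # sketch dependency-free
    w = np.zeros(2)
    hs = (np.array([1e-4, 0.0]), np.array([0.0, 1e-4]))
    for _ in range(steps):
        g = np.array([(robust_loss(w + h, eps) - robust_loss(w - h, eps)) / 2e-4
                      for h in hs])
        w -= lr * g
    return w

w_std = fit(eps=0.0)   # standard training: weight concentrates on core
w_adv = fit(eps=0.5)   # l2-adversarial training: weight spreads onto spur
print("standard:", w_std, "adversarial:", w_adv)
```

Under standard training the fitted weight concentrates on the core feature, while the $\ell_2$-adversarial objective penalizes $\|w\|_2$ and so spreads weight onto the correlated spurious feature, i.e., increased spurious reliance.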