While it has long been empirically observed that adversarial robustness may be at odds with standard accuracy and may further have disparate impacts on different classes, it remains an open question to what extent such observations hold and what role class imbalance plays in them. In this paper, we attempt to understand this question of accuracy disparity by taking a closer look at linear classifiers under a Gaussian mixture model. We decompose the impact of adversarial robustness into two parts: an inherent effect that degrades the standard accuracy on all classes, and another caused by the class imbalance ratio, which increases the accuracy disparity compared to standard training. Furthermore, we extend our model to the general family of stable distributions. We demonstrate that while the constraint of adversarial robustness consistently degrades the standard accuracy in the balanced class setting, the class imbalance ratio plays a fundamentally different role in accuracy disparity than in the Gaussian case, due to the heavy tail of the stable distribution. We additionally perform experiments on both synthetic and real-world datasets. The empirical results not only corroborate our theoretical findings, but also suggest that the implications may extend to nonlinear models over real-world datasets.
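As a quick illustration of the phenomenon described above (a minimal sketch, not the paper's actual construction), consider a one-dimensional Gaussian mixture with classes at means ±μ and an imbalanced prior p on the minority class. Under an ℓ∞ perturbation of radius ε, robust training effectively sees the shrunken means ±(μ − ε), which pushes the Bayes-optimal threshold further into the minority class. All parameter values (μ = 2, ε = 1, p = 0.2) are illustrative assumptions:

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def per_class_accuracy(threshold, mu):
    """Clean per-class accuracy of the classifier sign(x - threshold)
    when class +1 ~ N(+mu, 1) and class -1 ~ N(-mu, 1)."""
    acc_pos = Phi(mu - threshold)   # P(x > threshold | class +1)
    acc_neg = Phi(mu + threshold)   # P(x < threshold | class -1)
    return acc_pos, acc_neg

mu, eps, p = 2.0, 1.0, 0.2  # class mean, l_inf budget, minority (+1) prior

# Bayes-optimal thresholds (unit variance): standard training uses the
# true means +/-mu; robust training sees the shrunken means +/-(mu - eps),
# so the same prior ratio yields a larger threshold shift.
t_std = math.log((1 - p) / p) / (2 * mu)
t_rob = math.log((1 - p) / p) / (2 * (mu - eps))

std_pos, std_neg = per_class_accuracy(t_std, mu)
rob_pos, rob_neg = per_class_accuracy(t_rob, mu)

print(f"standard: acc+={std_pos:.3f} acc-={std_neg:.3f} "
      f"disparity={std_neg - std_pos:.3f}")
print(f"robust:   acc+={rob_pos:.3f} acc-={rob_neg:.3f} "
      f"disparity={rob_neg - rob_pos:.3f}")
```

In this toy setting the robust threshold lowers the minority class's clean accuracy while raising the majority class's, so the accuracy disparity grows relative to standard training, consistent with the decomposition sketched in the abstract.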