Deep neural networks (DNNs) are known to be prone to adversarial attacks, for which many remedies have been proposed. While adversarial training (AT) is regarded as the most robust defense, it suffers from degraded performance both on clean examples and under other types of attacks, e.g., attacks with larger perturbations. Meanwhile, regularizers that encourage uncertain outputs, such as entropy maximization (EntM) and label smoothing (LS), can maintain accuracy on clean examples and improve performance under weak attacks, yet their ability to defend against strong attacks remains in doubt. In this paper, we revisit uncertainty-promotion regularizers, including EntM and LS, in the field of adversarial learning. We show that EntM and LS alone provide robustness only under small perturbations. In contrast, we show that uncertainty-promotion regularizers complement AT in a principled manner, consistently improving performance both on clean examples and under various attacks, especially attacks with large perturbations. We further analyze how uncertainty-promotion regularizers enhance the performance of AT from the perspective of Jacobian matrices $\nabla_X f(X;\theta)$, and find that EntM effectively shrinks the norm of the Jacobian and hence promotes robustness.
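To make the two uncertainty-promotion regularizers concrete, the following is a minimal NumPy sketch (not the paper's code) of the EntM and LS training objectives: EntM subtracts a scaled prediction entropy from the cross-entropy loss, while LS replaces one-hot targets with smoothed ones. The coefficient names `lam` and `eps` are illustrative hyperparameters, not taken from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def entm_loss(logits, labels, lam=0.1):
    """Cross-entropy minus lam * mean prediction entropy (entropy maximization).

    Subtracting the entropy term rewards uncertain (high-entropy) outputs.
    """
    p = softmax(logits)
    n = logits.shape[0]
    ce = -np.log(p[np.arange(n), labels] + 1e-12).mean()
    ent = -(p * np.log(p + 1e-12)).sum(axis=-1).mean()
    return ce - lam * ent

def ls_loss(logits, labels, eps=0.1):
    """Cross-entropy against smoothed targets.

    The true class gets probability (1 - eps) + eps/K; every other
    class gets eps/K, so the target distribution still sums to 1.
    """
    p = softmax(logits)
    n, k = logits.shape
    q = np.full((n, k), eps / k)
    q[np.arange(n), labels] += 1.0 - eps
    return -(q * np.log(p + 1e-12)).sum(axis=-1).mean()
```

With `lam = 0` or `eps = 0` both reduce to plain cross-entropy; increasing either coefficient pushes the model toward less confident predictions, which is the behavior the abstract credits with shrinking the Jacobian norm when combined with AT.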