Recent studies reveal that Convolutional Neural Networks (CNNs) are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications. Many adversarial defense methods improve robustness at the cost of standard accuracy, creating a tension between standard and adversarial accuracy. In this paper, we observe an interesting phenomenon: feature statistics change monotonically and smoothly as the attack strength increases. Based on this observation, we propose Adaptive Feature Alignment (AFA) to generate features for arbitrary attack strengths. Our method is trained to automatically align features of arbitrary attack strength by predicting a fusing weight in a dual-BN architecture. Unlike previous works that must either retrain the model or manually tune hyper-parameters for different attack strengths, our method handles arbitrary attack strengths with a single model and introduces no extra hyper-parameters. Importantly, our method improves robustness against adversarial samples without incurring much loss in standard accuracy. Experiments on the CIFAR-10, SVHN, and Tiny-ImageNet datasets demonstrate that our method outperforms the state-of-the-art under a wide range of attack strengths.
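To make the dual-BN idea concrete, below is a minimal PyTorch sketch of a dual batch-normalization block whose outputs are blended by a predicted fusing weight. The branch names (`bn_clean`, `bn_adv`), the `weight_predictor` head, and the blending form are illustrative assumptions for exposition, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class DualBN2d(nn.Module):
    """Sketch of a dual-BN block with an adaptive fusing weight.

    Two batch-norm branches are kept: one intended for clean features and
    one for adversarial features. A lightweight predictor (hypothetical
    design: global pooling + linear + sigmoid) estimates a per-sample
    fusing weight in [0, 1], which blends the two normalized outputs.
    """

    def __init__(self, num_features: int):
        super().__init__()
        self.bn_clean = nn.BatchNorm2d(num_features)  # clean-feature statistics
        self.bn_adv = nn.BatchNorm2d(num_features)    # adversarial-feature statistics
        self.weight_predictor = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),   # (N, C, 1, 1)
            nn.Flatten(),              # (N, C)
            nn.Linear(num_features, 1),
            nn.Sigmoid(),              # fusing weight in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        lam = self.weight_predictor(x).view(-1, 1, 1, 1)
        # lam near 1 treats the input as clean; lam near 0 treats it as
        # strongly attacked. Intermediate values interpolate between the
        # two sets of normalization statistics.
        return lam * self.bn_clean(x) + (1.0 - lam) * self.bn_adv(x)
```

Because the fusing weight is predicted per sample rather than fixed, a single model can respond to inputs of varying attack strength at test time without any manually tuned hyper-parameter, which is the property the abstract highlights.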