Adversarial training is the industry standard for producing models that are robust to small adversarial perturbations. However, machine learning practitioners need models that are robust to other kinds of changes that occur naturally, such as changes in the style or illumination of input images. Such changes in input distribution have been effectively modeled as shifts in the mean and variance of deep image features. We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce models that are robust to various unseen distributional shifts. We explore the relationship between these perturbations and distributional shifts by visualizing adversarial features. Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training. By fine-tuning neural networks on adversarial feature distributions, we observe improved robustness to various unseen distributional shifts, including style variations and image corruptions. We further show that our adversarial feature perturbation is complementary to existing image-space data augmentation methods, leading to improved performance. The source code and pre-trained models are released at \url{https://github.com/azshue/AdvBN}.
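To make the idea concrete, below is a minimal PyTorch sketch of a feature-statistics perturbation in the spirit of AdvBN. It is illustrative only: the class name `AdvBNSketch`, the multiplicative parameterization (`delta_mean`, `delta_std`), and the `epsilon` bound and step counts are assumptions for exposition, not the released implementation (see the repository above for the authors' code).

```python
# Illustrative sketch of an AdvBN-style layer: the inner loop searches
# for a worst-case shift of per-channel feature statistics, after which
# the network would be fine-tuned on the shifted features.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdvBNSketch(nn.Module):
    """Perturb per-channel feature mean/std within a small ball (sketch)."""

    def __init__(self, num_channels, epsilon=0.1):
        super().__init__()
        self.epsilon = epsilon
        # Multiplicative perturbations of the feature statistics,
        # initialized to the identity (no distributional shift).
        self.delta_mean = nn.Parameter(torch.ones(1, num_channels, 1, 1))
        self.delta_std = nn.Parameter(torch.ones(1, num_channels, 1, 1))

    def forward(self, x):
        mu = x.mean(dim=(2, 3), keepdim=True)           # per-channel mean
        std = x.std(dim=(2, 3), keepdim=True) + 1e-5    # per-channel std
        normalized = (x - mu) / std
        # Re-apply adversarially perturbed statistics.
        return normalized * (std * self.delta_std) + mu * self.delta_mean

    def clamp_(self):
        # Keep perturbations inside the epsilon-ball around the identity.
        for p in (self.delta_mean, self.delta_std):
            p.data.clamp_(1 - self.epsilon, 1 + self.epsilon)


if __name__ == "__main__":
    torch.manual_seed(0)
    features = torch.randn(8, 256, 7, 7)         # stand-in deep features
    labels = torch.randint(0, 10, (8,))
    head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, 10))
    for p in head.parameters():                  # freeze the classifier;
        p.requires_grad_(False)                  # only the statistics move

    adv_bn = AdvBNSketch(num_channels=256)
    opt = torch.optim.SGD(adv_bn.parameters(), lr=0.05)
    for _ in range(3):                            # a few ascent steps
        loss = F.cross_entropy(head(adv_bn(features)), labels)
        opt.zero_grad()
        (-loss).backward()                        # gradient *ascent* on the loss
        opt.step()
        adv_bn.clamp_()
    # The network would then be fine-tuned on adv_bn(features).
```

In words: the inner loop finds the shift of per-channel mean and standard deviation, within a bounded region, that most increases the task loss, and fine-tuning on features carrying that shift encourages robustness to unseen distributional changes.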