The field of adversarial robustness has attracted significant attention in machine learning. In contrast to the common approach of training models that are accurate on average-case inputs, it aims to train models that are accurate on worst-case inputs, thereby yielding more robust and reliable models. Put differently, it tries to prevent an adversary from fooling a model. The study of adversarial robustness has largely focused on $\ell_p$-bounded adversarial perturbations, i.e., modifications of the inputs bounded in some $\ell_p$ norm. Nevertheless, it has been shown that state-of-the-art models are also vulnerable to other, more natural perturbations such as affine transformations, which had already been considered in machine learning in the context of data augmentation. This project reviews previous work on spatial robustness methods and proposes evolution strategies as zeroth-order optimization algorithms to find the worst affine transform for each input. The proposed method effectively yields robust models and allows the introduction of non-parametric adversarial perturbations.
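To make the core idea concrete, the sketch below shows one plausible instantiation of the approach the abstract names: a simple $(1+\lambda)$ evolution strategy that searches, without gradients, for the affine transform (here parameterized by rotation angle and translations, an illustrative assumption) that maximizes a model's loss on a single input. The `model`, the cross-entropy loss, and the parameter bounds are assumptions for illustration, not the project's exact algorithm.

```python
# A minimal sketch, assuming a PyTorch classifier `model` and a
# (rotation, translation) parameterization of the affine attack.
import torch
import torch.nn.functional as F

def affine_params_to_theta(params):
    """Map (angle, tx, ty) to the 2x3 matrix expected by affine_grid."""
    angle, tx, ty = params
    cos, sin = torch.cos(angle), torch.sin(angle)
    return torch.stack([
        torch.stack([cos, -sin, tx]),
        torch.stack([sin,  cos, ty]),
    ]).unsqueeze(0)  # shape (1, 2, 3)

def apply_affine(x, params):
    """Apply the affine transform to a (1, C, H, W) image tensor."""
    theta = affine_params_to_theta(params)
    grid = F.affine_grid(theta, x.shape, align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

@torch.no_grad()  # zeroth-order: only loss evaluations, no gradients
def worst_affine_es(model, x, y, steps=50, pop=20, sigma=0.1,
                    max_angle=0.5, max_shift=0.2):
    """(1+lambda)-ES: mutate the current best transform parameters
    with Gaussian noise and keep any offspring with higher loss."""
    best = torch.zeros(3)  # (angle, tx, ty); identity transform
    best_loss = F.cross_entropy(model(apply_affine(x, best)), y)
    bounds = torch.tensor([max_angle, max_shift, max_shift])
    for _ in range(steps):
        for _ in range(pop):
            cand = (best + sigma * torch.randn(3)).clamp(-bounds, bounds)
            loss = F.cross_entropy(model(apply_affine(x, cand)), y)
            if loss > best_loss:
                best, best_loss = cand, loss
    return best, best_loss
```

Because the inner loop only queries the loss value, the same search could wrap any transform family, including non-parametric ones, which is what makes the evolution-strategy formulation attractive here.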