We demonstrate, theoretically and empirically, that adversarial robustness can significantly benefit from semisupervised learning. Theoretically, we revisit the simple Gaussian model of Schmidt et al. that shows a sample complexity gap between standard and robust classification. We prove that unlabeled data bridges this gap: a simple semisupervised learning procedure (self-training) achieves high robust accuracy using the same number of labels required for achieving high standard accuracy. Empirically, we augment CIFAR-10 with 500K unlabeled images sourced from 80 Million Tiny Images and use robust self-training to outperform state-of-the-art robust accuracies by over 5 points in (i) $\ell_\infty$ robustness against several strong attacks via adversarial training and (ii) certified $\ell_2$ and $\ell_\infty$ robustness via randomized smoothing. On SVHN, adding the dataset's own extra training set with the labels removed provides gains of 4 to 10 points, within 1 point of the gain from using the extra labels.
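The robust self-training procedure referenced above can be summarized in three steps: train a standard model on the labeled data, use it to pseudo-label the unlabeled pool, then adversarially train a new model on the union. The following is a minimal PyTorch sketch of that loop, not the paper's exact configuration: the toy linear models, the PGD hyperparameters, and the random stand-in tensors are illustrative assumptions only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard l_inf PGD: ascend the loss, projecting into an eps-ball around x."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()


def robust_self_training(standard_model, robust_model,
                         x_labeled, y_labeled, x_unlabeled,
                         steps=100, batch_size=64, lr=0.1):
    """Pseudo-label the unlabeled pool, then adversarially train on the union.

    Assumes `standard_model` was already trained on the labeled data alone.
    """
    # Step 2: pseudo-label the unlabeled images with the standard model.
    with torch.no_grad():
        y_pseudo = standard_model(x_unlabeled).argmax(dim=1)
    x_all = torch.cat([x_labeled, x_unlabeled])
    y_all = torch.cat([y_labeled, y_pseudo])

    # Step 3: adversarial training (PGD) on labeled + pseudo-labeled data.
    opt = torch.optim.SGD(robust_model.parameters(), lr=lr, momentum=0.9)
    for _ in range(steps):
        idx = torch.randint(0, x_all.size(0), (batch_size,))
        x, y = x_all[idx], y_all[idx]
        x_adv = pgd_attack(robust_model, x, y)
        opt.zero_grad()
        F.cross_entropy(robust_model(x_adv), y).backward()
        opt.step()
    return robust_model


if __name__ == "__main__":
    # Toy stand-ins for CIFAR-10-sized inputs; real use would load the labeled
    # training set and the 500K unlabeled images selected from Tiny Images.
    model_std = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    model_rob = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    x_lab, y_lab = torch.rand(128, 3, 32, 32), torch.randint(0, 10, (128,))
    x_unlab = torch.rand(256, 3, 32, 32)
    robust_self_training(model_std, model_rob, x_lab, y_lab, x_unlab, steps=10)
```

The same recipe applies to the certified-robustness results by replacing the PGD inner loop with the noise-augmented training used for randomized smoothing.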