Any classifier can be "smoothed out" under Gaussian noise to build a new classifier that is provably robust to $\ell_2$-adversarial perturbations, namely, by averaging its predictions over the noise via randomized smoothing. For smoothed classifiers, the fundamental trade-off between accuracy and (adversarial) robustness has been well-evidenced in the literature: i.e., increasing the robustness of a classifier for one input may come at the expense of decreased accuracy for other inputs. In this paper, we propose a simple training method that leverages this trade-off to obtain robust smoothed classifiers, in particular, through a sample-wise control of robustness over the training samples. We make this control feasible by using "accuracy under Gaussian noise" as an easy-to-compute proxy for the adversarial robustness of an input. Specifically, we differentiate the training objective depending on this proxy to filter out samples that are unlikely to benefit from the worst-case (adversarial) objective. Our experiments show that the proposed method, despite its simplicity, consistently exhibits improved certified robustness over state-of-the-art training methods. Somewhat surprisingly, we find these improvements persist even for other notions of robustness, e.g., to various types of common corruptions.
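To make the two core ideas concrete, the following is a minimal NumPy sketch of (i) randomized smoothing via majority vote over Gaussian noise and (ii) "accuracy under Gaussian noise" as a per-sample proxy. The toy base classifier, the function name `smoothed_predict`, and the thresholding at the end are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def smoothed_predict(classify, x, sigma=0.25, n=1000, rng=None):
    """Randomized smoothing: classify n noisy copies of x, take a majority vote.

    Returns (predicted class, fraction of noisy copies agreeing with it).
    The returned fraction is the "accuracy under Gaussian noise" proxy
    when the majority class matches the true label.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    noise = rng.normal(0.0, sigma, size=(n,) + x.shape)
    preds = np.array([classify(x + eps) for eps in noise])
    counts = np.bincount(preds)
    return int(counts.argmax()), counts.max() / n

# Toy base classifier (assumed for illustration): sign of the first coordinate.
classify = lambda z: int(z[0] > 0)

x = np.array([0.8, 0.0])
label, proxy = smoothed_predict(classify, x)

# Sample-wise control (sketch): only samples whose noise-accuracy proxy
# exceeds a threshold would receive the worst-case (adversarial) objective;
# the rest fall back to a standard objective. The threshold is hypothetical.
use_adversarial_objective = proxy > 0.5
```

Since $x_0 = 0.8$ sits $3.2\sigma$ from the decision boundary, nearly all noisy copies are classified correctly, so this sample would be routed to the adversarial objective under the sketch above.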