Training deep neural network classifiers that are certifiably robust against adversarial attacks is critical to ensuring the security and reliability of AI-controlled systems. Although numerous state-of-the-art certified training methods have been developed, they are computationally expensive and scale poorly with both dataset size and network complexity. Widespread adoption of certified training is further hindered by the need for periodic retraining to incorporate new data and network improvements. In this paper, we propose Certified Robustness Transfer (CRT), a general-purpose framework for reducing the computational overhead of any certifiably robust training method through knowledge transfer. Given a robust teacher network, our framework uses a novel training loss to transfer the teacher's robustness to the student. We provide theoretical and empirical validation of CRT. Our experiments on CIFAR-10 show that CRT speeds up certified training by $8\times$ on average across three architecture generations while achieving robustness comparable to state-of-the-art methods. We also show that CRT scales to large datasets such as ImageNet.
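The abstract does not specify the form of CRT's transfer loss. As a minimal illustrative sketch only, the snippet below shows one generic way a distillation-style objective can transfer a robust teacher's behavior to a student in PyTorch: a weighted sum of the standard cross-entropy task loss and a KL term matching the student's temperature-softened predictions to the teacher's. The mixing weight `alpha` and temperature `T` are assumed hyperparameters for illustration, not the paper's actual loss.

```python
# Illustrative sketch of a distillation-style robustness-transfer loss.
# NOTE: this is NOT the paper's CRT loss (the abstract does not define it);
# it only demonstrates the general knowledge-transfer pattern described.
import torch
import torch.nn.functional as F

def transfer_loss(student_logits, teacher_logits, labels, alpha=0.5, T=4.0):
    """Blend the task loss with a term distilling the (robust) teacher.

    alpha and T are assumed hyperparameters: alpha trades off the two terms,
    T softens both distributions before the KL comparison.
    """
    # Standard supervised cross-entropy on the true labels.
    ce = F.cross_entropy(student_logits, labels)
    # KL divergence between softened student and teacher predictions;
    # the T*T factor is the usual temperature-scaling correction.
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return (1.0 - alpha) * ce + alpha * kl

# Typical training step: the pretrained robust teacher is frozen.
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = transfer_loss(student(x), teacher_logits, y)
```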