The threat of adversarial examples has motivated work on training certifiably robust neural networks to facilitate efficient verification of local robustness at inference time. We formalize a notion of global robustness, which captures the operational properties of on-line local robustness certification while yielding a natural learning objective for robust training. We show that widely-used architectures can be easily adapted to this objective by incorporating efficient global Lipschitz bounds into the network, yielding certifiably-robust models by construction that achieve state-of-the-art verifiable and clean accuracy. Notably, this approach requires significantly less time and memory than recent certifiable training methods, and leads to negligible costs when certifying points on-line; for example, our evaluation shows that it is possible to train a large Tiny-ImageNet model in a matter of hours. We posit that this is possible using inexpensive global bounds -- despite prior suggestions that tighter local bounds are needed for good performance -- because these models are trained to achieve tighter global bounds. Namely, we prove that the maximum achievable verifiable accuracy for a given dataset is not improved by using a local bound.
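To illustrate why on-line certification becomes nearly free once a global Lipschitz bound is in hand, the following is a minimal sketch of a generic Lipschitz-margin certificate for a feed-forward network: the global bound K is upper-bounded by the product of per-layer spectral norms and computed once off-line, after which a point is certified whenever its logit margin exceeds sqrt(2)*K*eps. This is not the paper's exact construction; the model architecture, the helper names `global_lipschitz_bound` and `certify`, and the L2 radius convention are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical feed-forward ReLU network used only for illustration.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

def global_lipschitz_bound(model: nn.Sequential) -> float:
    """Upper-bound the network's global L2 Lipschitz constant by the
    product of per-layer spectral norms (ReLU is 1-Lipschitz, so it
    contributes a factor of 1)."""
    K = 1.0
    for layer in model:
        if isinstance(layer, nn.Linear):
            K *= torch.linalg.matrix_norm(layer.weight, ord=2).item()
    return K

def certify(model: nn.Sequential, x: torch.Tensor, eps: float, K: float) -> bool:
    """Certify local L2 robustness at x with radius eps. If f is
    K-Lipschitz, each logit difference f_i - f_j is at most
    sqrt(2)*K-Lipschitz, so a margin above sqrt(2)*K*eps guarantees
    the predicted class cannot change within an eps-ball."""
    logits = model(x).squeeze(0)
    top2 = torch.topk(logits, 2).values
    margin = (top2[0] - top2[1]).item()
    return margin > (2 ** 0.5) * K * eps

x = torch.randn(1, 784)
K = global_lipschitz_bound(model)       # off-line, once per model
print(certify(model, x, eps=0.1, K=K))  # on-line: one forward pass
```

Because K depends only on the weights, certifying each point at inference time costs a single forward pass plus a margin comparison, which is the "negligible cost" the abstract refers to; training against this margin condition is what drives the network toward tighter global bounds.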