The threat of adversarial examples has motivated work on training certifiably robust neural networks to facilitate efficient verification of local robustness at inference time. We formalize a notion of global robustness, which captures the operational properties of on-line local robustness certification while yielding a natural learning objective for robust training. We show that widely-used architectures can be easily adapted to this objective by incorporating efficient global Lipschitz bounds into the network, yielding certifiably-robust models by construction that achieve state-of-the-art verifiable accuracy. Notably, this approach requires significantly less time and memory than recent certifiable training methods, and incurs negligible cost when certifying points on-line; for example, our evaluation shows that it is possible to train a large robust Tiny-ImageNet model in a matter of hours. Our models effectively leverage inexpensive global Lipschitz bounds for real-time certification, despite prior suggestions that tighter local bounds are needed for good performance; we posit this is possible because our models are specifically trained to achieve tighter global bounds. Namely, we prove that the maximum achievable verifiable accuracy for a given dataset is not improved by using a local bound in place of a global one.
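To make the certification mechanism concrete, the following is a minimal sketch (not the paper's implementation) of how a global Lipschitz bound enables instant on-line certification: the bound is the product of the layers' spectral norms, computed once ahead of time, and a point is certified whenever its logit margin exceeds what any bounded perturbation could overcome. The toy two-layer network, its weights, and the margin test are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU network (illustrative weights, not from the paper).
W1 = rng.normal(size=(16, 4)) * 0.1
W2 = rng.normal(size=(3, 16)) * 0.1

def lipschitz_bound(weights):
    # Global l2 Lipschitz bound: product of the layers' spectral norms.
    # ReLU is 1-Lipschitz, so it does not increase the bound.
    K = 1.0
    for W in weights:
        K *= np.linalg.norm(W, ord=2)  # largest singular value
    return K

def certify(x, eps, weights=(W1, W2)):
    # Forward pass.
    z = x
    for W in weights[:-1]:
        z = np.maximum(W @ z, 0.0)
    logits = weights[-1] @ z
    # Margin between the top logit and the runner-up.
    top2 = np.sort(logits)[-2:]
    margin = top2[1] - top2[0]
    # Each logit is K-Lipschitz, so a logit difference is 2K-Lipschitz:
    # no l2 perturbation of norm <= eps can flip the prediction
    # if the margin exceeds 2 * eps * K.
    return bool(margin > 2.0 * eps * lipschitz_bound(weights))

x = rng.normal(size=4)
print(certify(x, eps=0.01))
```

Because the bound is global, certifying a new point costs only one forward pass plus a comparison, which is what makes real-time certification cheap; training then pushes the network toward large margins relative to `K`.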