Deep networks for computer vision are unreliable when they encounter adversarial examples. In this paper, we introduce a framework that uses the dense intrinsic constraints in natural images to robustify inference. By introducing constraints at inference time, we shift the burden of robustness from training to the inference algorithm, allowing the model to adjust dynamically to each image's unique and potentially novel characteristics. Among different constraints, we find that equivariance-based constraints are the most effective, because they allow dense constraints in the feature space without overly constraining the representation at a fine-grained level. Our theoretical results validate the importance of having such dense constraints at inference time. Our empirical experiments show that restoring feature equivariance at inference time defends against worst-case adversarial perturbations. The method improves adversarial robustness on four datasets (ImageNet, Cityscapes, PASCAL VOC, and MS-COCO) across image recognition, semantic segmentation, and instance segmentation tasks. The project page is available at equi4robust.cs.columbia.edu.
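To make the idea of restoring feature equivariance at inference time concrete, the following is a minimal toy sketch, not the paper's implementation. Everything in it is an assumption chosen for illustration: the 1-D signal standing in for an image, the hypothetical `features` map (an asymmetric 3-tap filter followed by tanh), the single transformation `flip`, the finite-difference gradient, and the parameters `eps`, `lr`, `steps`, and `h`. It searches for a small bounded additive vector `r` that reduces the mismatch between the features of the transformed input and the transformed features of the input.

```python
import math

def flip(v):
    """Transformation g (assumed for illustration): flip a 1-D signal."""
    return v[::-1]

def features(x):
    """Hypothetical dense feature map: asymmetric 3-tap circular filter + tanh."""
    k = (0.2, 0.5, 0.3)
    n = len(x)
    return [math.tanh(k[0] * x[(i - 1) % n] + k[1] * x[i] + k[2] * x[(i + 1) % n])
            for i in range(n)]

def equivariance_loss(x, feature_fn, transform):
    """Squared distance between f(g(x)) and g(f(x)); zero when f commutes with g at x."""
    a = feature_fn(transform(x))
    b = transform(feature_fn(x))
    return sum((p - q) ** 2 for p, q in zip(a, b))

def restore_equivariance(x, feature_fn, transform, eps=0.1, lr=0.05, steps=50, h=1e-4):
    """Projected gradient descent on an additive vector r with |r_i| <= eps.

    Gradients are estimated by central finite differences (an assumption made
    to keep the sketch dependency-free). The best iterate is returned, so the
    result never has a higher loss than r = 0.
    """
    def loss_at(r):
        return equivariance_loss([xi + ri for xi, ri in zip(x, r)], feature_fn, transform)

    r = [0.0] * len(x)
    best_r, best_loss = r[:], loss_at(r)
    for _ in range(steps):
        grad = []
        for i in range(len(r)):
            up, down = r[:], r[:]
            up[i] += h
            down[i] -= h
            grad.append((loss_at(up) - loss_at(down)) / (2 * h))
        # Gradient step, then projection back onto the eps-ball (L-infinity).
        r = [max(-eps, min(eps, ri - lr * gi)) for ri, gi in zip(r, grad)]
        current = loss_at(r)
        if current < best_loss:
            best_r, best_loss = r[:], current
    return best_r

# Usage: a smooth signal plus a high-frequency perturbation (a stand-in for an attack).
x_adv = [math.sin(2 * math.pi * i / 16) + (0.1 if i % 2 else -0.1) for i in range(16)]
r = restore_equivariance(x_adv, features, flip)
x_restored = [a + b for a, b in zip(x_adv, r)]
```

In this toy, the loss depends on local differences in the signal, so a smooth input scores low and a high-frequency perturbation scores higher; that is only a loose analogy for the dense constraints described above. With a real network, `features` would presumably be an intermediate feature map and the gradient would come from automatic differentiation rather than finite differences.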