Adversarial patch attacks are an emerging security threat for real-world deep learning applications. We present Demasked Smoothing, the first approach (to our knowledge) to certify the robustness of semantic segmentation models against this threat model. Previous work on certifiable defenses against patch attacks has mostly focused on the image classification task and often requires changes to the model architecture and additional training, which is undesirable and computationally expensive. In Demasked Smoothing, any segmentation model can be used without particular training, fine-tuning, or restriction of the architecture. Using different masking strategies, Demasked Smoothing can be applied both for certified detection and for certified recovery. In extensive experiments we show that Demasked Smoothing can on average certify 64% of the pixel predictions against a 1% patch in the detection task and 48% against a 0.5% patch in the recovery task on the ADE20K dataset.
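To make the described pipeline concrete, the following is a minimal sketch, under assumed interfaces, of applying a segmentation model to differently masked copies of an input and aggregating the per-pixel votes. The helper names (`masks`, `inpaint_model`, `seg_model`) and the voting aggregation are illustrative assumptions, not the paper's exact certification procedure.

```python
import torch

def demasked_smoothing_predict(image, masks, inpaint_model, seg_model, num_classes):
    """Hypothetical sketch: mask image regions, reconstruct the masked pixels,
    segment each completed copy, and aggregate per-pixel votes over all copies."""
    h, w = image.shape[-2:]
    votes = torch.zeros(num_classes, h, w)
    for mask in masks:                               # mask: 1 = keep, 0 = masked out
        masked = image * mask                        # remove the masked-out region
        completed = inpaint_model(masked, mask)      # assumed image-completion model
        pred = seg_model(completed).argmax(dim=0)    # assumed (C, H, W) logits -> labels
        votes.scatter_add_(0, pred.unsqueeze(0),
                           torch.ones(1, h, w))      # one vote per pixel for pred label
    return votes.argmax(dim=0), votes                # majority label and vote counts
```

A pixel's prediction could then be certified when its vote margin cannot be overturned by the limited number of masked copies that a patch of the given size can intersect; the exact detection and recovery guarantees depend on the masking strategy described in the paper.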