Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency - Zhuanzhi (专知) Papers

Robustness · Robust · Consistency · Backdoor attacks · Samples

March 27, 2023

Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency

Xiaogeng Liu, Minghui Li, Haoyu Wang, Shengshan Hu, Dengpan Ye, Hai Jin, Libing Wu, Chaowei Xiao

From arXiv. Accepted by CVPR 2023. Code is available at https://github.com/CGCL-codes/TeCo

Deep neural networks have been shown to be vulnerable to backdoor attacks. Detecting the trigger samples during the inference stage, i.e., test-time trigger sample detection, can prevent the backdoor from being triggered. However, existing detection methods often require the defenders to have extensive access to victim models, extra clean data, or knowledge about the appearance of backdoor triggers, which limits their practicality. In this paper, we propose the test-time corruption robustness consistency evaluation (TeCo), a novel test-time trigger sample detection method that only needs the hard-label outputs of the victim models without any extra information. Our journey begins with the intriguing observation that backdoor-infected models perform similarly across different image corruptions on clean images, but behave inconsistently on trigger samples. Based on this phenomenon, we design TeCo to evaluate test-time robustness consistency by calculating the deviation of the severity that leads to a prediction transition across different corruptions. Extensive experiments demonstrate that, compared with state-of-the-art defenses, which even require either certain information about the trigger types or access to clean data, TeCo outperforms them on different backdoor attacks, datasets, and model architectures, achieving a 10% higher AUROC and 5 times the stability.
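The scoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes images are float arrays in [0, 1], that `predict` returns only a hard label, and that severities run from 1 to 5. The three corruption functions, their strength constants, and the convention of recording `max_severity + 1` when the label never changes are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np

# Hypothetical stand-in corruptions; the abstract does not list the ones used.
def gaussian_noise(img, severity, rng):
    sigma = 0.04 * severity  # assumed noise scale per severity level
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def brightness(img, severity, rng):
    return np.clip(img + 0.1 * severity, 0.0, 1.0)  # assumed brightness shift

def contrast(img, severity, rng):
    factor = 1.0 - 0.15 * severity  # assumed contrast reduction
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)

CORRUPTIONS = [gaussian_noise, brightness, contrast]

def first_flip_severity(predict, img, corrupt, max_severity=5, seed=0):
    """Lowest severity at which the hard label differs from the clean label.

    Returns max_severity + 1 if the label never changes (an assumed convention).
    """
    base_label = predict(img)
    rng = np.random.default_rng(seed)
    for severity in range(1, max_severity + 1):
        if predict(corrupt(img, severity, rng)) != base_label:
            return severity
    return max_severity + 1

def teco_score(predict, img, corruptions=CORRUPTIONS, max_severity=5):
    """Standard deviation of the flip severities across corruption types.

    Following the abstract, a larger deviation suggests a trigger sample.
    """
    flips = [first_flip_severity(predict, img, c, max_severity)
             for c in corruptions]
    return float(np.std(flips))
```

Only `predict` is queried, which matches the hard-label setting in the abstract. A detector built on this score would still need a decision threshold, which the abstract does not specify.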
