Trojan attacks on deep neural networks are both dangerous and surreptitious. Over the past few years, Trojan attacks have advanced from using only a simple trigger and targeting only one class to using many sophisticated triggers and targeting multiple classes. However, Trojan defenses have not kept pace with this development. Most defense methods still make outdated assumptions about Trojan triggers and target classes, and thus can be easily circumvented by modern Trojan attacks. In this paper, we advocate general defenses that are effective and robust against various Trojan attacks, and propose two novel "filtering" defenses with these characteristics called Variational Input Filtering (VIF) and Adversarial Input Filtering (AIF). VIF and AIF leverage variational inference and adversarial training, respectively, to purify all potential Trojan triggers in the input at run time without making any assumptions about their number or form. We further extend "filtering" to "filtering-then-contrasting" - a new defense mechanism that helps avoid the drop in classification accuracy on clean data caused by filtering. Extensive experimental results show that our proposed defenses significantly outperform 4 well-known defenses in mitigating 5 different Trojan attacks, including the two state-of-the-art attacks that defeat many strong defenses.