Trojan Signatures in DNN Weights - Zhuanzhi (专知) Papers

September 7, 2021

Trojan Signatures in DNN Weights

Greg Fields, Mohammad Samragh, Mojan Javaheripi, Farinaz Koushanfar, Tara Javidi

From arXiv, 8 pages, 13 figures

Deep neural networks have been shown to be vulnerable to backdoor, or trojan, attacks where an adversary has embedded a trigger in the network at training time such that the model correctly classifies all standard inputs, but generates a targeted, incorrect classification on any input which contains the trigger. In this paper, we present the first ultra light-weight and highly effective trojan detection method that does not require access to the training/test data, does not involve any expensive computations, and makes no assumptions on the nature of the trojan trigger. Our approach focuses on analysis of the weights of the final, linear layer of the network. We empirically demonstrate several characteristics of these weights that occur frequently in trojaned networks, but not in benign networks. In particular, we show that the distribution of the weights associated with the trojan target class is clearly distinguishable from the weights associated with other classes. Using this, we demonstrate the effectiveness of our proposed detection method against state-of-the-art attacks across a variety of architectures, datasets, and trigger types.
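
The abstract's core idea, comparing the final-layer weights of each class and looking for one class that stands apart, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' published procedure: the function name `flag_outlier_classes`, the use of the per-class mean weight as the summary statistic, the median/MAD modified z-score, and the 3.5 threshold. The weight matrix is assumed to have shape (num_classes, num_features), as in the `weight` attribute of a PyTorch `nn.Linear` layer, and the example data is synthetic.

```python
import numpy as np


def flag_outlier_classes(final_weights, threshold=3.5):
    """Flag classes whose mean final-layer weight is an outlier.

    final_weights: array of shape (num_classes, num_features).
    The statistic (row mean) and the scoring rule (modified z-score
    based on the median absolute deviation) are illustrative
    assumptions, not the method described in the paper.
    """
    w = np.asarray(final_weights, dtype=float)
    row_means = w.mean(axis=1)                 # one summary value per class
    median = np.median(row_means)
    mad = np.median(np.abs(row_means - median))
    if mad == 0:
        # All classes look identical; nothing can be flagged.
        return row_means, np.zeros_like(row_means), []
    scores = 0.6745 * (row_means - median) / mad
    flagged = [int(i) for i in np.flatnonzero(np.abs(scores) > threshold)]
    return row_means, scores, flagged


# Synthetic demonstration: 10 classes, 512 features, with the weights of
# class 3 shifted upward to imitate a class whose distribution differs.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=(10, 512))
weights[3] += 0.1
row_means, scores, flagged = flag_outlier_classes(weights)
print("flagged classes:", flagged)
```

A check of this kind needs only the stored weights of the last layer, which matches the abstract's claim that no training or test data and no expensive computation are required. Whether a mean shift is the right signature for a given attack is an open assumption of this sketch.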
