A Study of the Attention Abnormality in Trojaned BERTs - Zhuanzhi Papers

May 13, 2022

A Study of the Attention Abnormality in Trojaned BERTs

Weimin Lyu, Songzhu Zheng, Tengfei Ma, Chao Chen

From arXiv; accepted to NAACL-HLT 2022

Trojan attacks raise serious security concerns. In this paper, we investigate the underlying mechanism of Trojaned BERT models. We observe the attention focus drifting behavior of Trojaned models: when the model encounters a poisoned input, the trigger token hijacks the attention focus regardless of the context. We provide a thorough qualitative and quantitative analysis of this phenomenon, revealing insights into the Trojan mechanism. Based on this observation, we propose an attention-based Trojan detector to distinguish Trojaned models from clean ones. To the best of our knowledge, this is the first paper to analyze the Trojan mechanism and to develop a Trojan detector based on the transformer's attention.
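The attention focus drifting described in the abstract can be illustrated with a toy sketch. This is not the authors' detector. It assumes that a head's attention for one input is available as a square matrix of row-normalized weights (one row per query token), and that a head "focuses" on a token when at least half of the rows give that token their largest weight. The helper names `focus_token` and `focus_drifts_to`, the threshold `min_ratio`, and the matrices are all hypothetical and chosen only for illustration.

```python
def focus_token(attn, min_ratio=0.5):
    """Find the token that most query rows attend to most strongly.

    attn: square list of lists, attn[i][j] = weight query i puts on token j.
    Returns (index, ratio); index is None if no token reaches min_ratio.
    """
    n = len(attn)
    counts = [0] * n
    for row in attn:
        # Index of the largest weight in this row (first one wins on ties).
        top = max(range(len(row)), key=row.__getitem__)
        counts[top] += 1
    best = max(range(n), key=counts.__getitem__)
    ratio = counts[best] / n
    return (best if ratio >= min_ratio else None), ratio


def focus_drifts_to(clean_attn, poisoned_attn, trigger_idx, min_ratio=0.5):
    """True if the head has a focus on the clean input and its focus on the
    poisoned input is the trigger token (assumed criterion, for illustration)."""
    clean_focus, _ = focus_token(clean_attn, min_ratio)
    poisoned_focus, _ = focus_token(poisoned_attn, min_ratio)
    return clean_focus is not None and poisoned_focus == trigger_idx


# Made-up weights: on the clean input, every row peaks at token 0.
clean = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.6, 0.1, 0.3],
]
# Same sentence with a hypothetical trigger inserted at index 2:
# every row now peaks at the trigger.
poisoned = [
    [0.1, 0.1, 0.7, 0.1],
    [0.2, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.3, 0.1, 0.5, 0.1],
]

print(focus_token(clean))
print(focus_token(poisoned))
print(focus_drifts_to(clean, poisoned, trigger_idx=2))
```

In practice the matrices would come from a real model's attention outputs for each layer and head. The choice of threshold, and how to aggregate the result over heads and inputs, are left open here.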

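Related content

Attention mechanism

The attention mechanism was first proposed in the field of computer vision, but it arguably took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly in machine translation; their work is generally regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic.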
