As machine learning (ML) systems are increasingly deployed in the real world to handle sensitive tasks and make decisions across many fields, the security and privacy of these models have become correspondingly critical. In particular, Deep Neural Networks (DNNs) have been shown to be vulnerable to backdoor attacks, in which adversaries with access to the training data manipulate it by inserting carefully crafted samples into the training set. Although the NLP community has produced several studies on crafting backdoor attacks that demonstrate the vulnerability of language models, to the best of our knowledge no prior work exists to defend against such attacks. To bridge this gap, we present RobustEncoder: a novel clustering-based technique for detecting and removing backdoor attacks in the text domain. Extensive empirical results demonstrate the effectiveness of our technique at detecting and removing backdoor triggers. Our code is available at https://github.com/marwanomar1/Backdoor-Learning-for-NLP
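To illustrate the general idea behind clustering-based backdoor detection, the sketch below clusters the sentence embeddings of one class into two groups and flags a suspiciously small minority cluster as potentially poisoned. This is a minimal, hypothetical sketch of the family of techniques the abstract refers to, not the exact RobustEncoder algorithm; the function name `flag_suspicious` and the `threshold` parameter are illustrative assumptions.

```python
# Hypothetical sketch of clustering-based backdoor detection on text
# embeddings; NOT the actual RobustEncoder implementation.
import numpy as np
from sklearn.cluster import KMeans

def flag_suspicious(embeddings: np.ndarray, threshold: float = 0.35) -> np.ndarray:
    """Cluster embeddings of a single class into two groups and flag the
    smaller cluster as potentially poisoned if its relative size falls
    below `threshold` (an assumed, tunable cutoff)."""
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
    sizes = np.bincount(labels, minlength=2)
    minority = int(np.argmin(sizes))
    if sizes[minority] / len(labels) < threshold:
        return labels == minority              # boolean mask of suspicious samples
    return np.zeros(len(labels), dtype=bool)   # nothing flagged

# Toy usage: 95 "clean" points plus 5 outliers standing in for poisoned samples.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 1, (95, 16)), rng.normal(6, 0.5, (5, 16))])
print(flag_suspicious(emb).sum(), "samples flagged")
```

The intuition is that backdoored samples share a trigger and therefore tend to collapse into a tight, small cluster in embedding space, separable from the bulk of clean samples of the same label.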