经证明的文字对抗随机攻击的威力[MASK] (Certified Robustness to Text Adversarial Attacks by Randomized [MASK]) - 专知论文

会员服务 ·

0

稳健性 · 掩码 · INFORMS · 成比例 · 平滑 ·

2021 年 7 月 25 日

Certified Robustness to Text Adversarial Attacks by Randomized [MASK]

翻译：经证明的文字对抗随机攻击的威力[MASK]

Jiehang Zeng,Xiaoqing Zheng,Jianhan Xu,Linyang Li,Liping Yuan,Xuanjing Huang

from arxiv, Under Review for TOPS

Recently, few certified defense methods have been developed to provably guarantee the robustness of a text classifier to adversarial synonym substitutions. However, all existing certified defense methods assume that the defenders are informed of how the adversaries generate synonyms, which is not a realistic scenario. In this paper, we propose a certifiably robust defense method by randomly masking a certain proportion of the words in an input text, in which the above unrealistic assumption is no longer necessary. The proposed method can defend against not only word substitution-based attacks, but also character-level perturbations. We can certify the classifications of over 50% texts to be robust to any perturbation of 5 words on AGNEWS, and 2 words on SST2 dataset. The experimental results show that our randomized smoothing method significantly outperforms recently proposed defense methods across multiple datasets.

翻译：最近,为保证文本分类者对对抗性同义词替代的可靠性而开发的经认证的国防方法很少。但是,所有现有的经认证的国防方法都假定捍卫者知道对手是如何产生同义词的,这是不现实的情景。在本文中,我们建议了一种可以验证的强有力的防御方法,随机掩蔽输入文本中某些部分的词,在其中不再需要上述不现实的假设。拟议的方法不仅可以防御基于文字的替换攻击,还可以防御性水平的干扰。我们可以证明超过50%的文本的分类对AGNEWS的5个字和SST2数据集的2个字的任何干扰都是可靠的。实验结果显示,我们随机的平滑方法大大优于最近提出的跨越多个数据集的防御方法。

0

相关内容

稳健性

近期必读的六篇AAAI 2021【对抗攻击（Adversarial Attack）】相关论文和代码

专知会员服务

55+阅读 · 2021年2月17日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

专知会员服务

23+阅读 · 2020年4月22日

【MIT】对抗鲁棒性的流形正则化，Manifold Regularization for Adversarial Robustness

【MIT】对抗鲁棒性的流形正则化，Manifold Regularization for Adversarial Robustness

专知会员服务

28+阅读 · 2020年3月11日

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

专知会员服务

24+阅读 · 2020年2月22日

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding ImageClassification Decisions and Improved NeuralNetwork Robustness）

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding ImageClassification Decisions and Improved NeuralNetwork Robustness）

专知会员服务

6+阅读 · 2019年11月24日

【NeurIPS2019】基于累加噪声的对抗鲁棒性（Certified Adversarial Robustness with Additive Noise），Changyou Chen

【NeurIPS2019】基于累加噪声的对抗鲁棒性（Certified Adversarial Robustness with Additive Noise），Changyou Chen

专知会员服务

36+阅读 · 2019年11月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

鲁棒机器学习相关文献集

鲁棒机器学习相关文献集

专知

8+阅读 · 2019年8月18日

意识是一种数学模式

意识是一种数学模式

CreateAMind

3+阅读 · 2019年6月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

An Orthogonal Classifier for Improving the Adversarial Robustness of Neural Networks

Arxiv

1+阅读 · 2021年9月24日

A black-box adversarial attack for poisoning clustering

Arxiv

0+阅读 · 2021年9月23日

CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks

CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks

Arxiv

0+阅读 · 2021年9月22日

Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions

Arxiv

0+阅读 · 2021年9月21日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Weight Poisoning Attacks on Pre-trained Models

Weight Poisoning Attacks on Pre-trained Models

Arxiv

5+阅读 · 2020年4月14日

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

Arxiv

15+阅读 · 2019年1月15日

Are Generative Classifiers More Robust to Adversarial Attacks?

Are Generative Classifiers More Robust to Adversarial Attacks?

Arxiv

4+阅读 · 2018年7月9日

Generate To Adapt: Aligning Domains using Generative Adversarial Networks

Arxiv

4+阅读 · 2018年4月1日

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Arxiv

18+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

相关VIP内容

近期必读的六篇AAAI 2021【对抗攻击（Adversarial Attack）】相关论文和代码

专知会员服务

55+阅读 · 2021年2月17日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

专知会员服务

23+阅读 · 2020年4月22日

【MIT】对抗鲁棒性的流形正则化，Manifold Regularization for Adversarial Robustness

【MIT】对抗鲁棒性的流形正则化，Manifold Regularization for Adversarial Robustness

专知会员服务

28+阅读 · 2020年3月11日

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

GeoffreyHinton-ICML2020投稿论文-偏转对抗攻击 Deflecting Adversarial Attacks

专知会员服务

24+阅读 · 2020年2月22日

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding ImageClassification Decisions and Improved NeuralNetwork Robustness）

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding ImageClassification Decisions and Improved NeuralNetwork Robustness）

专知会员服务

6+阅读 · 2019年11月24日

【NeurIPS2019】基于累加噪声的对抗鲁棒性（Certified Adversarial Robustness with Additive Noise），Changyou Chen

【NeurIPS2019】基于累加噪声的对抗鲁棒性（Certified Adversarial Robustness with Additive Noise），Changyou Chen

专知会员服务

36+阅读 · 2019年11月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

《太空边缘（临近空间）的武器化？军事高空平台的进展与前景》

《利用星基增强系统（SBAS）信号进行射频干扰（RFI）检测与特征分析》

美陆军在“艾布拉姆斯”坦克与“布拉德利”步战车上测试“牛蛙”反无人机炮塔

《军事领域特性及其对军事人工智能应用的影响》

相关资讯

鲁棒机器学习相关文献集

鲁棒机器学习相关文献集

专知

8+阅读 · 2019年8月18日

意识是一种数学模式

意识是一种数学模式

CreateAMind

3+阅读 · 2019年6月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

An Orthogonal Classifier for Improving the Adversarial Robustness of Neural Networks

Arxiv

1+阅读 · 2021年9月24日

A black-box adversarial attack for poisoning clustering

Arxiv

0+阅读 · 2021年9月23日

CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks

CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks

Arxiv

0+阅读 · 2021年9月22日

Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions

Arxiv

0+阅读 · 2021年9月21日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Weight Poisoning Attacks on Pre-trained Models

Weight Poisoning Attacks on Pre-trained Models

Arxiv

5+阅读 · 2020年4月14日

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

Arxiv

15+阅读 · 2019年1月15日

Are Generative Classifiers More Robust to Adversarial Attacks?

Are Generative Classifiers More Robust to Adversarial Attacks?

Arxiv

4+阅读 · 2018年7月9日

Generate To Adapt: Aligning Domains using Generative Adversarial Networks

Arxiv

4+阅读 · 2018年4月1日

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Arxiv

18+阅读 · 2018年1月5日

微信扫码咨询专知VIP会员