Several intriguing problems persist in adversarial training, including the robustness-accuracy trade-off, robust overfitting, and gradient masking, posing great challenges to both reliable evaluation and practical deployment. Here, we show that these problems share one common cause: low-quality samples in the dataset. We first identify an intrinsic property of the data, called the problematic score, and then design controlled experiments to investigate its connections with these problems. Specifically, we find that when problematic data is removed, robust overfitting and gradient masking can be largely alleviated, and that the robustness-accuracy trade-off is more prominent for datasets containing highly problematic data. These observations not only verify our intuition about data quality but also open new opportunities to advance adversarial training. Remarkably, simply removing problematic data before adversarial training, despite making the training set smaller, consistently yields better robustness across different adversary settings, training methods, and neural architectures.
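To make the data-filtering idea concrete, the following is a minimal sketch (in PyTorch, not the authors' implementation) of removing high-score examples before standard PGD adversarial training. The per-example `problematic_score` array, the `threshold`, and all hyperparameters are hypothetical placeholders, since the abstract does not specify how the score is computed or which cutoff is used.

```python
# Minimal sketch, assuming a precomputed per-example "problematic score".
# The scoring function itself is NOT defined here (hypothetical input).
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

def filter_dataset(dataset, problematic_score, threshold):
    """Keep only examples whose problematic score is below the threshold."""
    keep = [i for i, s in enumerate(problematic_score) if s < threshold]
    return Subset(dataset, keep)

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD for crafting training-time adversarial examples."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_train(model, dataset, problematic_score, threshold,
                      epochs=1, device="cpu"):
    """Adversarial training on the filtered (smaller) training set."""
    loader = DataLoader(filter_dataset(dataset, problematic_score, threshold),
                        batch_size=128, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=0.1,
                          momentum=0.9, weight_decay=5e-4)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            x_adv = pgd_attack(model, x, y)
            opt.zero_grad()
            F.cross_entropy(model(x_adv), y).backward()
            opt.step()
```

The only change relative to standard adversarial training is the `filter_dataset` step; the attack and optimization loop are otherwise conventional.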