As data becomes increasingly vital, companies are cautious about releasing it: competitors could use the data to train high-performance models, posing a serious threat to the releasing company's commercial competitiveness. To prevent good models from being trained on released data, we can add imperceptible perturbations to it. Since such perturbations aim to disrupt the entire training process, they should reflect the vulnerability of DNN training rather than that of any single model. Based on this new idea, we seek perturbed examples that are always unrecognized (never correctly classified) throughout training. In this paper, we craft them using the gradients of model checkpoints, forming the proposed self-ensemble protection (SEP), which is highly effective because (1) learning on examples that are ignored during normal training tends to yield DNNs that ignore normal examples, and (2) the cross-model gradients of checkpoints are close to orthogonal, making them as diverse as DNNs with different architectures. Thus, the strong performance of our ensemble requires only the computation of training a single model. In extensive experiments against 9 baselines on 3 datasets and 5 architectures, SEP is verified to be a new state of the art; e.g., our small $\ell_\infty=2/255$ perturbations reduce the accuracy of a ResNet18 on CIFAR-10 from 94.56% to 14.68%, compared to 41.35% for the best previously known method. Code is available at https://github.com/Sizhe-Chen/SEP.
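The following is a minimal PyTorch sketch of the checkpoint-ensemble idea described above, not the paper's exact algorithm: `model`, the list of `checkpoints` (state dicts saved while training that single model), and the batch `(x, y)` are placeholders supplied by the caller, and the step size and iteration count are illustrative assumptions. The perturbation ascends the loss summed over all checkpoints, so that models at every training stage misclassify the protected examples, and is projected back into the $\ell_\infty$ ball after each step.

```python
# Hypothetical sketch: self-ensemble perturbation over training checkpoints.
# Names, step size, and iteration count are illustrative assumptions,
# not the authors' reference implementation.
import torch
import torch.nn.functional as F

def self_ensemble_perturb(model, checkpoints, x, y, eps=2/255, steps=10):
    """Craft a protective perturbation for a batch (x, y) by ensembling
    gradients across checkpoints saved while training a single model."""
    delta = torch.zeros_like(x, requires_grad=True)
    alpha = eps / 4  # step size; a common PGD-style heuristic, assumed here

    for _ in range(steps):
        grad_sum = torch.zeros_like(x)
        for state in checkpoints:
            # Reuse one network; each checkpoint acts as an ensemble member.
            model.load_state_dict(state)
            model.eval()
            loss = F.cross_entropy(model(x + delta), y)
            grad_sum += torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            # Ascend the ensemble loss so every checkpoint misclassifies
            # the perturbed examples, then project into the l_inf ball.
            delta += alpha * grad_sum.sign()
            delta.clamp_(-eps, eps)
    return (x + delta).detach()
```

Summing sign-gradients over checkpoints is the simplest way to exploit their near-orthogonality noted above: because the members come from one training run, the ensemble costs no more than training a single model.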