Knowledge Distillation (KD) is a widely used technique for transferring knowledge from pre-trained teacher models to (usually more lightweight) student models. However, in certain situations this technique is more of a curse than a blessing. For instance, KD poses a potential risk of exposing intellectual property (IP): even if a trained machine learning model is released as a 'black box' (e.g., as executable software or an API without open-sourced code), it can still be replicated via KD by imitating its input-output behavior. To prevent this unwanted effect of KD, this paper introduces and investigates a concept called Nasty Teacher: a specially trained teacher network that yields nearly the same performance as a normal one, but significantly degrades the performance of student models that learn by imitating it. We propose a simple yet effective algorithm for building the nasty teacher, called self-undermining knowledge distillation. Specifically, we aim to maximize the difference between the output of the nasty teacher and that of a normal pre-trained network. Extensive experiments on several datasets demonstrate that our method is effective against both standard KD and data-free KD, providing model owners with the desired KD-immunity for the first time. We hope our preliminary study can raise awareness of, and interest in, this new practical problem of both social and legal importance.
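To make the self-undermining idea concrete, below is a minimal PyTorch-style sketch of one plausible instantiation, assuming the "difference" is measured as a KL divergence between temperature-softened outputs and balanced against a standard cross-entropy term that preserves the nasty teacher's own accuracy. The weight `omega` and temperature `tau` are illustrative hyperparameters, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def self_undermining_loss(nasty_logits, normal_logits, labels,
                          omega=0.04, tau=4.0):
    """Sketch of a self-undermining KD objective (hypothetical form).

    nasty_logits:  outputs of the nasty teacher being trained
    normal_logits: outputs of a frozen, normally pre-trained network
                   (computed under torch.no_grad(), so no gradient flows)
    """
    # Standard cross-entropy keeps the nasty teacher accurate on its task.
    ce = F.cross_entropy(nasty_logits, labels)
    # KL divergence between temperature-softened predictions; maximizing it
    # pushes the nasty teacher's soft outputs away from the normal network's,
    # so a student imitating those outputs learns poorly.
    kl = F.kl_div(
        F.log_softmax(nasty_logits / tau, dim=1),
        F.softmax(normal_logits.detach() / tau, dim=1),
        reduction="batchmean",
    ) * (tau ** 2)
    # Minimize CE while maximizing the divergence (hence the minus sign).
    return ce - omega * kl
```

Under this reading, the nasty teacher's hard predictions (and thus its accuracy) stay close to normal, while its soft output distribution, which is exactly what KD students imitate, is driven away from an informative one.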