深层动态神经网络中的嵌入式知识蒸馏 (Embedded Knowledge Distillation in Depth-Level Dynamic Neural Network) - 专知论文

会员服务 ·

0

DDNN · Networking · Neural Networks · 蒸馏 · Performer ·

2021 年 8 月 10 日

Embedded Knowledge Distillation in Depth-Level Dynamic Neural Network

翻译：深层动态神经网络中的嵌入式知识蒸馏

Qi Zhao,Shuchang Lyu,Zhiwei Zhang,Ting-Bing Xu,Guangliang Cheng

from arxiv, 4 pages, 3 figures; Accepted by CVPR2021 workshop: Dynamic Neural Networks Meets Computer Vision

In real applications, different computation-resource devices need different-depth networks (e.g., ResNet-18/34/50) with high-accuracy. Usually, existing methods either design multiple networks and train them independently, or construct depth-level/width-level dynamic neural networks which is hard to prove the accuracy of each sub-net. In this article, we propose an elegant Depth-Level Dynamic Neural Network (DDNN) integrated different-depth sub-nets of similar architectures. To improve the generalization of sub-nets, we design the Embedded-Knowledge-Distillation (EKD) training mechanism for the DDNN to implement knowledge transfer from the teacher (full-net) to multiple students (sub-nets). Specifically, the Kullback-Leibler (KL) divergence is introduced to constrain the posterior class probability consistency between full-net and sub-nets, and self-attention distillation on the same resolution feature of different depth is addressed to drive more abundant feature representations of sub-nets. Thus, we can obtain multiple high-accuracy sub-nets simultaneously in a DDNN via the online knowledge distillation in each training iteration without extra computation cost. Extensive experiments on CIFAR-10/100, and ImageNet datasets demonstrate that sub-nets in DDNN with EKD training achieve better performance than individually training networks while preserving the original performance of full-nets.

翻译：在实际应用中,不同的计算资源装置需要具有高度准确性的不同深度网络(如ResNet-18/34/50),通常,现有方法要么设计多个网络并对其进行独立培训,要么建立深度/高度动态神经网络,很难证明每个子网的准确性。在本篇文章中,我们建议建立一个优雅的深度水平动态神经网络(DDNN),整合类似结构的不同深度子网。为了改进原始网络的概括化,我们为DDNN设计了嵌入式知识蒸馏(EKD)培训机制,以便从教师(全网)向多个学生(子网)进行知识转让。具体地说,Kullback-Lebel (KL) 差异在于限制全网和子网之间的等级概率一致性,以及在不同深度的同一分辨率的自我蒸馏,目的是推动更丰富的子网络的特征展示。因此,我们可以在不通过内部数据库进行高级精确性能测试的同时,通过内部网络进行多种高性能性能测试,然后通过内部网络进行高级性能测试。

0

相关内容

DDNN

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

专知会员服务

59+阅读 · 2020年6月30日

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

专知会员服务

102+阅读 · 2020年4月25日

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

专知会员服务

46+阅读 · 2020年4月8日

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

专知会员服务

96+阅读 · 2020年3月25日

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

专知会员服务

65+阅读 · 2020年3月5日

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

专知会员服务

14+阅读 · 2020年1月1日

近期必读的6篇 NeurIPS 2019 的零样本学习(Zero-Shot Learning)论文

近期必读的6篇 NeurIPS 2019 的零样本学习(Zero-Shot Learning)论文

专知会员服务

60+阅读 · 2019年12月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

专知

41+阅读 · 2020年3月25日

BERT 瘦身之路：Distillation，Quantization，Pruning

BERT 瘦身之路：Distillation，Quantization，Pruning

AINLP

10+阅读 · 2019年10月22日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

已删除

将门创投

5+阅读 · 2019年5月5日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

论文浅尝 | Leveraging Knowledge Bases in LSTMs

论文浅尝 | Leveraging Knowledge Bases in LSTMs

开放知识图谱

6+阅读 · 2017年12月8日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

Student Helping Teacher: Teacher Evolution via Self-Knowledge Distillation

Arxiv

0+阅读 · 2021年10月7日

FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

Arxiv

0+阅读 · 2021年10月6日

Progressive Network Grafting for Few-Shot Knowledge Distillation

Progressive Network Grafting for Few-Shot Knowledge Distillation

Arxiv

4+阅读 · 2020年12月9日

Kernel Based Progressive Distillation for Adder Neural Networks

Arxiv

5+阅读 · 2020年9月29日

SEEK: Segmented Embedding of Knowledge Graphs

Arxiv

8+阅读 · 2020年5月2日

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

Arxiv

12+阅读 · 2020年4月15日

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

19+阅读 · 2019年11月20日

Knowledge Distillation from Internal Representations

Knowledge Distillation from Internal Representations

Arxiv

4+阅读 · 2019年10月8日

Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph

Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph

Arxiv

7+阅读 · 2019年10月8日

Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network

Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network

Arxiv

3+阅读 · 2018年8月11日

VIP会员

文章信息

相关主题

Neural Networks

相关VIP内容

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

专知会员服务

59+阅读 · 2020年6月30日

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

专知会员服务

102+阅读 · 2020年4月25日

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

专知会员服务

46+阅读 · 2020年4月8日

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

专知会员服务

96+阅读 · 2020年3月25日

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

专知会员服务

65+阅读 · 2020年3月5日

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

专知会员服务

14+阅读 · 2020年1月1日

近期必读的6篇 NeurIPS 2019 的零样本学习(Zero-Shot Learning)论文

近期必读的6篇 NeurIPS 2019 的零样本学习(Zero-Shot Learning)论文

专知会员服务

60+阅读 · 2019年12月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

面向具身智能的多模态数据存储与检索：综述

《算法战争研究计划全景评估》35页

【CMU博士论文】水下三维视觉感知与生成

智能体战争：自主人工智能军备竞赛全景透视

相关资讯

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

图卷积神经网络蒸馏知识，Distillating Knowledge from GCN

专知

41+阅读 · 2020年3月25日

BERT 瘦身之路：Distillation，Quantization，Pruning

BERT 瘦身之路：Distillation，Quantization，Pruning

AINLP

10+阅读 · 2019年10月22日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

已删除

将门创投

5+阅读 · 2019年5月5日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

论文浅尝 | Leveraging Knowledge Bases in LSTMs

论文浅尝 | Leveraging Knowledge Bases in LSTMs

开放知识图谱

6+阅读 · 2017年12月8日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

相关论文

Student Helping Teacher: Teacher Evolution via Self-Knowledge Distillation

Arxiv

0+阅读 · 2021年10月7日

FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

Arxiv

0+阅读 · 2021年10月6日

Progressive Network Grafting for Few-Shot Knowledge Distillation

Progressive Network Grafting for Few-Shot Knowledge Distillation

Arxiv

4+阅读 · 2020年12月9日

Kernel Based Progressive Distillation for Adder Neural Networks

Arxiv

5+阅读 · 2020年9月29日

SEEK: Segmented Embedding of Knowledge Graphs

Arxiv

8+阅读 · 2020年5月2日

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

Arxiv

12+阅读 · 2020年4月15日

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

19+阅读 · 2019年11月20日

Knowledge Distillation from Internal Representations

Knowledge Distillation from Internal Representations

Arxiv

4+阅读 · 2019年10月8日

Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph

Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph

Arxiv

7+阅读 · 2019年10月8日

Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network

Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network

Arxiv

3+阅读 · 2018年8月11日

微信扫码咨询专知VIP会员