具有一致蒸馏点的探测变异器知识蒸馏 (Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling) - 专知论文

会员服务 ·

0

蒸馏 · 知识 (knowledge) · Extensibility · 变换 · Backbone ·

2022 年 11 月 16 日

Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling

翻译：具有一致蒸馏点的探测变异器知识蒸馏

Yu Wang,Xin Li,Shengzhao Wen,Fukui Yang,Wanping Zhang,Gang Zhang,Haocheng Feng,Junyu Han,Errui Ding

DETR is a novel end-to-end transformer architecture object detector, which significantly outperforms classic detectors when scaling up the model size. In this paper, we focus on the compression of DETR with knowledge distillation. While knowledge distillation has been well-studied in classic detectors, there is a lack of researches on how to make it work effectively on DETR. We first provide experimental and theoretical analysis to point out that the main challenge in DETR distillation is the lack of consistent distillation points. Distillation points refer to the corresponding inputs of the predictions for student to mimic, and reliable distillation requires sufficient distillation points which are consistent between teacher and student. Based on this observation, we propose a general knowledge distillation paradigm for DETR(KD-DETR) with consistent distillation points sampling. Specifically, we decouple detection and distillation tasks by introducing a set of specialized object queries to construct distillation points. In this paradigm, we further propose a general-to-specific distillation points sampling strategy to explore the extensibility of KD-DETR. Extensive experiments on different DETR architectures with various scales of backbones and transformer layers validate the effectiveness and generalization of KD-DETR. KD-DETR boosts the performance of DAB-DETR with ResNet-18 and ResNet-50 backbone to 41.4$\%$, 45.7$\%$ mAP, respectively, which are 5.2$\%$, 3.5$\%$ higher than the baseline, and ResNet-50 even surpasses the teacher model by $2.2\%$.

翻译：DETR是一个新的端到端变压器结构物体探测器,在扩大模型规模时,它明显优于经典探测器。在本文中,我们侧重于通过知识蒸馏压缩DETR。虽然知识蒸馏在经典探测器中研究过,但缺乏关于如何使DETR(KD-DETR)有效运行的研究。我们首先提供实验和理论分析,指出DETR蒸馏的主要挑战是缺乏一致的蒸馏点。蒸馏点是指学生模拟预测的相应投入,可靠的蒸馏需要足够的蒸馏点,这与师生之间是一致的。根据这一观察,我们提出了DTR(KD-DETR)一般蒸馏点取样的普通知识蒸馏模式。我们首先提供一套专门的对象查询,以构建蒸馏点。在这个模型中,我们进一步建议一个通用的蒸馏点对学生进行模拟预测的对应投入,而可靠的蒸馏点需要有足够的蒸馏点,而教师和学生之间的蒸馏点需要有足够的蒸馏点。根据这一观察,我们建议为DETR(K-DE-DE-DE-DE-DE-DE-DE-DE-DE-Q-L-S-S-Sl-S-Syal-Slal-Slal-S-S-S-Syal-Syal-Slationxxxxxxxxxxxxxxxxxxxxxxxxxxx 和S-DE-DE-DE-DE-DE-DE-S-S-S-S-S-D-S-S-S-S-S-S-S-S-S-S-D-S-S-D-S-S-S-S-D-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-D-D-Sl-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-S-

0

相关内容

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

专知会员服务

54+阅读 · 2021年1月20日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

非小细胞肺癌中黄连素干预TF/FVIIa通路抑制转移的作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

双靶点干预对糖尿病视网膜病变中“血管-纤维化开关”的调控

国家自然科学基金

1+阅读 · 2014年12月31日

长链非编码RNA n332962参与LPS调控人牙髓干细胞分化的分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA-VEC1340靶定KLF4在血管内皮细胞损伤中的调控及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

miR-137/YB-1调控肿瘤细胞干性参与卵巢癌耐药调节的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

Ca2+信号通路介导猪骨髓MSCs成脂分化的分子机制及其营养调控

国家自然科学基金

0+阅读 · 2012年12月31日

UTMD"可逆开闭"血视网膜屏障联合rAAV-MERTK治疗视网膜色素变性的实验研究

国家自然科学基金

0+阅读 · 2012年12月31日

TNFR2内化修饰在糖尿病视网膜病变中的作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

视网膜血管内皮细胞周期调控与糖尿病视网膜病变发生发展关系的研究

国家自然科学基金

0+阅读 · 2011年12月31日

Memory Efficient Continual Learning with Transformers

Memory Efficient Continual Learning with Transformers

Arxiv

0+阅读 · 2023年1月13日

Towards Large-Scale Small Object Detection: Survey and Benchmarks

Arxiv

40+阅读 · 2022年7月28日

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Arxiv

12+阅读 · 2021年6月9日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Arxiv

13+阅读 · 2020年4月13日

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Arxiv

19+阅读 · 2020年3月31日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Deep Learning for Generic Object Detection: A Survey

Deep Learning for Generic Object Detection: A Survey

Arxiv

14+阅读 · 2018年9月6日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

专知会员服务

54+阅读 · 2021年1月20日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NeurIPS 2025】稳定电影度量：面向专业视频生成的结构化分类与评测体系

战场AI决策支持系统

【博士论文】面向排序与扩散模型的安全、高效与鲁棒强化学习

面向 AI 生成图像的安全与鲁棒水印：全面综述

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

相关论文

Memory Efficient Continual Learning with Transformers

Memory Efficient Continual Learning with Transformers

Arxiv

0+阅读 · 2023年1月13日

Towards Large-Scale Small Object Detection: Survey and Benchmarks

Arxiv

40+阅读 · 2022年7月28日

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Arxiv

12+阅读 · 2021年6月9日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Arxiv

13+阅读 · 2020年4月13日

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Arxiv

19+阅读 · 2020年3月31日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Deep Learning for Generic Object Detection: A Survey

Deep Learning for Generic Object Detection: A Survey

Arxiv

14+阅读 · 2018年9月6日

相关基金

非小细胞肺癌中黄连素干预TF/FVIIa通路抑制转移的作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

双靶点干预对糖尿病视网膜病变中“血管-纤维化开关”的调控

国家自然科学基金

1+阅读 · 2014年12月31日

长链非编码RNA n332962参与LPS调控人牙髓干细胞分化的分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA-VEC1340靶定KLF4在血管内皮细胞损伤中的调控及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

miR-137/YB-1调控肿瘤细胞干性参与卵巢癌耐药调节的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

Ca2+信号通路介导猪骨髓MSCs成脂分化的分子机制及其营养调控

国家自然科学基金

0+阅读 · 2012年12月31日

UTMD"可逆开闭"血视网膜屏障联合rAAV-MERTK治疗视网膜色素变性的实验研究

国家自然科学基金

0+阅读 · 2012年12月31日

TNFR2内化修饰在糖尿病视网膜病变中的作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

视网膜血管内皮细胞周期调控与糖尿病视网膜病变发生发展关系的研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员