Deep learning has advanced at an unprecedented pace over the last decade and has become the primary choice in many application domains. This progress is mainly attributed to a virtuous cycle in which rapidly growing computing resources enable increasingly sophisticated algorithms to exploit massive data. However, handling the ever-growing volume of data with limited computing power has become increasingly challenging. To this end, diverse approaches have been proposed to improve data processing efficiency. Dataset distillation, a dataset reduction method, addresses this problem by synthesizing a small, representative dataset from substantial data, and has attracted much attention from the deep learning community. Existing dataset distillation methods can be taxonomized into meta-learning and data-matching frameworks according to whether they explicitly mimic the performance of the target data. Although dataset distillation has shown surprising performance in compressing datasets, several limitations remain, such as the difficulty of distilling high-resolution data. This paper provides a holistic understanding of dataset distillation from multiple aspects, including distillation frameworks and algorithms, factorized dataset distillation, performance comparison, and applications. Finally, we discuss challenges and promising directions to further promote future studies on dataset distillation.
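To make the meta-learning formulation concrete, the following is a toy sketch, not any specific published algorithm: a bilevel optimization in which a tiny synthetic set is tuned so that a model trained on it alone performs well on the real data. All names and the one-dimensional linear-regression setup are illustrative assumptions; for tractability the synthetic inputs are held fixed and only the synthetic labels are learned, with finite-difference gradients standing in for automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real dataset: 200 noisy points from the line y = 2x + 1.
X = rng.uniform(-1, 1, 200)
Y = 2 * X + 1 + rng.normal(0, 0.1, 200)

# Synthetic "distilled" set: two points with fixed inputs and learnable labels
# (a deliberate simplification; full methods also optimize the inputs).
xs = np.array([-0.5, 0.5])

def inner_fit(ys):
    # Inner loop: train the model (here, closed-form least squares)
    # on the synthetic set alone.
    A = np.stack([xs, np.ones_like(xs)], axis=1)
    w, b = np.linalg.lstsq(A, ys, rcond=None)[0]
    return w, b

def outer_loss(ys):
    # Outer objective: how well the synthetically trained model fits real data.
    w, b = inner_fit(ys)
    return np.mean((w * X + b - Y) ** 2)

ys = np.zeros(2)      # initial synthetic labels
lr, eps = 0.5, 1e-5
for _ in range(200):  # outer loop: update the synthetic labels
    grad = np.array([
        (outer_loss(ys + eps * e) - outer_loss(ys - eps * e)) / (2 * eps)
        for e in np.eye(2)
    ])
    ys -= lr * grad

w, b = inner_fit(ys)
print(f"distilled labels {ys}, recovered line y = {w:.2f}x + {b:.2f}")
```

After the outer loop converges, training on just the two synthetic points recovers roughly the same line as training on all 200 real points, which is the essence of the compression that dataset distillation aims for.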