微缩图像网模型传输如何? (How Well Do Sparse Imagenet Models Transfer?) - 专知论文

会员服务 ·

0

剪枝 · MoDELS · Performer · ImageNet (数据集) · 稀疏 ·

2021 年 12 月 23 日

How Well Do Sparse Imagenet Models Transfer?

翻译：微缩图像网模型传输如何?

Eugenia Iofinova,Alexandra Peste,Mark Kurtz,Dan Alistarh

from arxiv, 19 pages, 8 figures. Latest update is a correction to Fig. 2

Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream," specialized datasets. Generally, it is understood that more accurate models on the "upstream" dataset will provide better transfer accuracy "downstream". In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset, which have been pruned - that is, compressed by sparsifiying their connections. Specifically, we consider transfer using unstructured pruned models obtained by applying several state-of-the-art pruning methods, including magnitude-based, second-order, re-growth and regularization approaches, in the context of twelve standard transfer tasks. In a nutshell, our study shows that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities, and, while doing so, can lead to significant inference and even training speedups. At the same time, we observe and analyze significant differences in the behaviour of different pruning methods.

翻译：传输学习是一种典型的范例,通过这种模式,在大型“ 上流” 数据集上预先培训的模型能够适应在“ 下流” 和专门数据集上产生良好结果。一般而言,人们理解,在“ 上流” 数据集上更精确的模型将提供更好的传输准确性“ 下游 ” 。在这项工作中,我们深入地调查了在图像网络数据集上受过培训的“ 进化神经网络” 背景下的这一现象,这些网络已被扎根—— 也就是通过对其连接进行垃圾过滤而压缩。具体地说,我们考虑使用通过应用若干最先进的运行方法,包括基于数量、第二顺序、再增长和正规化的方法,在12项标准传输任务中,获得的非结构化的调整模型,从而产生良好的效果。在一项研究中,我们的研究显示,即使高度紧张的模型也能够匹配甚至超过密度模型的传输性能,同时可以导致重大的推断甚至培训速度。同时,我们观察并分析不同运行方法行为中的重大差异。

0

相关内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

β-catenin/Ets1复合体在胶质母细胞瘤中对hTERT表达调控机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

神经元和星形胶质细胞特异性miRNA对神经网络发育和功能的调控机制

国家自然科学基金

1+阅读 · 2013年12月31日

基于超稀疏结构学习的压缩感知重建研究

国家自然科学基金

5+阅读 · 2013年12月31日

多模态层级靶向脑胶质瘤siRNA纳米影像载体系统可控制备与抑瘤活性研究

国家自然科学基金

0+阅读 · 2012年12月31日

红树林生境异质性的时空尺度效应与鱼类多样性的维持机制

国家自然科学基金

0+阅读 · 2012年12月31日

超宽带通信数字接收机的压缩采样技术研究

国家自然科学基金

0+阅读 · 2011年12月31日

葡萄芪化物对细胞凋亡的抑制作用研究

国家自然科学基金

0+阅读 · 2011年12月31日

面向海量图像高速拷贝检测的视觉指纹提取与匹配

国家自然科学基金

0+阅读 · 2010年12月31日

基于Top-hat变换的多尺度多梯度多结构元素图像处理技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

人脸线性鉴别特征提取方法的深化研究

国家自然科学基金

0+阅读 · 2009年12月31日

On the Representation Collapse of Sparse Mixture of Experts

Arxiv

0+阅读 · 2022年4月20日

Impact of Tokenization on Language Models: An Analysis for Turkish

Arxiv

0+阅读 · 2022年4月19日

CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

Arxiv

0+阅读 · 2022年4月18日

Graph-incorporated Latent Factor Analysis for High-dimensional and Sparse Matrices

Arxiv

0+阅读 · 2022年4月16日

SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words

Arxiv

0+阅读 · 2022年4月16日

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Arxiv

11+阅读 · 2021年9月3日

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP

Arxiv

12+阅读 · 2021年8月30日

Pre-Trained Models: Past, Present and Future

Arxiv

19+阅读 · 2021年6月15日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

VIP会员

文章信息

相关主题

ImageNet (数据集)

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《太空边缘（临近空间）的武器化？军事高空平台的进展与前景》

《利用星基增强系统（SBAS）信号进行射频干扰（RFI）检测与特征分析》

美陆军在“艾布拉姆斯”坦克与“布拉德利”步战车上测试“牛蛙”反无人机炮塔

《军事领域特性及其对军事人工智能应用的影响》

相关资讯

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

On the Representation Collapse of Sparse Mixture of Experts

Arxiv

0+阅读 · 2022年4月20日

Impact of Tokenization on Language Models: An Analysis for Turkish

Arxiv

0+阅读 · 2022年4月19日

CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

Arxiv

0+阅读 · 2022年4月18日

Graph-incorporated Latent Factor Analysis for High-dimensional and Sparse Matrices

Arxiv

0+阅读 · 2022年4月16日

SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words

Arxiv

0+阅读 · 2022年4月16日

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Arxiv

11+阅读 · 2021年9月3日

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP

Arxiv

12+阅读 · 2021年8月30日

Pre-Trained Models: Past, Present and Future

Arxiv

19+阅读 · 2021年6月15日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

相关基金

β-catenin/Ets1复合体在胶质母细胞瘤中对hTERT表达调控机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

神经元和星形胶质细胞特异性miRNA对神经网络发育和功能的调控机制

国家自然科学基金

1+阅读 · 2013年12月31日

基于超稀疏结构学习的压缩感知重建研究

国家自然科学基金

5+阅读 · 2013年12月31日

多模态层级靶向脑胶质瘤siRNA纳米影像载体系统可控制备与抑瘤活性研究

国家自然科学基金

0+阅读 · 2012年12月31日

红树林生境异质性的时空尺度效应与鱼类多样性的维持机制

国家自然科学基金

0+阅读 · 2012年12月31日

超宽带通信数字接收机的压缩采样技术研究

国家自然科学基金

0+阅读 · 2011年12月31日

葡萄芪化物对细胞凋亡的抑制作用研究

国家自然科学基金

0+阅读 · 2011年12月31日

面向海量图像高速拷贝检测的视觉指纹提取与匹配

国家自然科学基金

0+阅读 · 2010年12月31日

基于Top-hat变换的多尺度多梯度多结构元素图像处理技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

人脸线性鉴别特征提取方法的深化研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员