Despite the popularity of Model Compression and Multitask Learning, how to effectively compress a multitask model has been less thoroughly analyzed due to the challenging entanglement of tasks in the parameter space. In this paper, we propose DiSparse, a simple, effective, and first-of-its-kind multitask pruning and sparse training scheme. We consider each task independently by disentangling the importance measurement, and take a unanimous decision among all tasks when performing parameter pruning and selection. Our experimental results demonstrate superior performance on various configurations and settings compared to popular sparse training and pruning methods. Beyond its effectiveness for compression, DiSparse also provides a powerful analysis tool to the multitask learning community. Surprisingly, we even observed better performance than some dedicated multitask learning methods in several cases, despite the high model sparsity enforced by DiSparse. We analyzed the pruning masks generated with DiSparse and observed strikingly similar sparse network architectures identified by each task, even before training starts. We also observe the existence of a "watershed" layer where the task relatedness sharply drops, implying no benefit from continued parameter sharing. Our code and models will be available at: https://github.com/SHI-Labs/DiSparse-Multitask-Model-Compression.
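To make the unanimous-decision idea concrete, below is a minimal sketch, not the authors' released implementation: it assumes hypothetical per-task saliency scores over the shared parameters and reads "unanimous" as pruning a shared parameter only when every task independently marks it as unimportant.

```python
import torch

def unanimous_prune_mask(per_task_scores, sparsity):
    """Sketch of a unanimous multitask pruning decision (illustrative only).

    per_task_scores: list of tensors, one importance score per shared
        parameter for each task (e.g., a gradient- or magnitude-based saliency).
    sparsity: fraction of shared parameters targeted for removal by each task.
    Returns a binary mask in which a parameter is pruned only if all tasks
    agree it is unimportant, i.e., it is kept if any task keeps it.
    """
    keep_masks = []
    for scores in per_task_scores:
        k = max(1, int((1.0 - sparsity) * scores.numel()))
        # Each task independently keeps its top-k most important parameters.
        threshold = torch.topk(scores.flatten(), k, largest=True).values.min()
        keep_masks.append(scores >= threshold)
    # Unanimous pruning vote: drop a parameter only when no task keeps it.
    mask = keep_masks[0]
    for m in keep_masks[1:]:
        mask = mask | m
    return mask.float()
```

A usage example would pass one saliency tensor per task for a shared layer and multiply the returned mask into that layer's weights; the disentanglement lies in computing each task's scores separately before any joint decision is made.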