The vast majority of deep models use multiple gradient signals, typically corresponding to a sum of multiple loss terms, to update a shared set of trainable weights. However, these multiple updates can impede optimal training by pulling the model in conflicting directions. We present Gradient Sign Dropout (GradDrop), a probabilistic masking procedure which samples gradients at an activation layer based on their level of consistency. GradDrop is implemented as a simple deep layer that can be used in any deep net and synergizes with other gradient balancing approaches. We show that GradDrop outperforms the state-of-the-art multiloss methods within traditional multitask and transfer learning settings, and we discuss how GradDrop reveals links between optimal multiloss training and gradient stochasticity.
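Below is a minimal NumPy sketch of the kind of sign-consistency masking the abstract describes: per-task gradients at a shared activation are kept or dropped depending on a sampled sign, where the sampling probability reflects how well the gradient signs agree. The specific purity formula and all names (graddrop, task_grads) are illustrative assumptions for this sketch, not the paper's released implementation.

```python
import numpy as np

def graddrop(task_grads, rng=None):
    """Combine per-task gradients with a GradDrop-style sign mask.

    task_grads: array of shape (num_tasks, *activation_shape) holding each
        loss's gradient w.r.t. the same activation tensor.
    Returns a single combined gradient of shape activation_shape.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = 1e-12

    # Positive-sign purity in [0, 1], computed elementwise over the activation:
    # close to 1 when all task gradients agree and are positive, close to 0
    # when they agree and are negative, near 0.5 when they conflict.
    total = task_grads.sum(axis=0)
    purity = 0.5 * (1.0 + total / (np.abs(task_grads).sum(axis=0) + eps))

    # For each activation element, sample whether to keep the positive or the
    # negative sign on this step.
    keep_positive = rng.random(purity.shape) < purity

    # Zero out every per-task gradient entry whose sign disagrees with the
    # sampled choice, then sum the survivors across tasks.
    mask = np.where(keep_positive, task_grads > 0, task_grads < 0)
    return (mask * task_grads).sum(axis=0)

# Example: three partially conflicting per-task gradients over a 4-unit activation.
grads = np.array([[ 0.5, -0.2,  0.1,  0.3],
                  [-0.4,  0.6,  0.1, -0.3],
                  [ 0.2, -0.1,  0.1,  0.0]])
print(graddrop(grads, rng=np.random.default_rng(0)))
```

Because the mask is resampled at every step, elements with strong sign agreement are almost always kept while conflicting elements are stochastically resolved toward one sign, which is the behavior the abstract attributes to GradDrop.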