关于培训隐含培训模式 (On Training Implicit Models) - 专知论文

会员服务 ·

0

确切的 · MoDELS · 后向 · 有向 · Performer ·

2021 年 11 月 9 日

On Training Implicit Models

翻译：关于培训隐含培训模式

Zhengyang Geng,Xin-Yu Zhang,Shaojie Bai,Yisen Wang,Zhouchen Lin

from arxiv, 24 pages, 4 figures, in The 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

This paper focuses on training implicit models of infinite layers. Specifically, previous works employ implicit differentiation and solve the exact gradient for the backward propagation. However, is it necessary to compute such an exact but expensive gradient for training? In this work, we propose a novel gradient estimate for implicit models, named phantom gradient, that 1) forgoes the costly computation of the exact gradient; and 2) provides an update direction empirically preferable to the implicit model training. We theoretically analyze the condition under which an ascent direction of the loss landscape could be found, and provide two specific instantiations of the phantom gradient based on the damped unrolling and Neumann series. Experiments on large-scale tasks demonstrate that these lightweight phantom gradients significantly accelerate the backward passes in training implicit models by roughly 1.7 times, and even boost the performance over approaches based on the exact gradient on ImageNet.

翻译：本文侧重于培训无限层的隐含模型。具体地说, 先前的工程使用隐含差异, 并解决了后向传播的确切梯度。但是, 是否有必要为培训计算出这样一个精确但昂贵的梯度? 在这项工作中, 我们提议对隐含模型( 名为幻影梯度 ) 进行新的梯度估计, 1 即放弃精确梯度的昂贵计算; 2) 提供了比隐含模式培训更可取的更新方向。我们从理论上分析了可以找到损失地貌的高度方向的条件, 并且提供了两种基于摇动不动和 Neumann 系列的幽灵梯度的具体速率。大规模任务实验表明, 这些轻量的幽灵梯度梯度大大加快了培训隐含模型的后向传递速度, 大约1.7 次, 甚至根据图像网的精确梯度提高方法的性能。

0

相关内容

确切的

图像修复研究进展综述

专知会员服务

64+阅读 · 2021年3月9日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

专知会员服务

55+阅读 · 2020年5月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【论文】自训练噪声student模型提高ImageNet分类准确率（Self-training with Noisy Student improves ImageNet classification），谷歌研究科学家Quoc V. Le等

【论文】自训练噪声student模型提高ImageNet分类准确率（Self-training with Noisy Student improves ImageNet classification），谷歌研究科学家Quoc V. Le等

专知会员服务

24+阅读 · 2019年11月20日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【推荐】RNN最新研究进展综述

【推荐】RNN最新研究进展综述

机器学习研究会

26+阅读 · 2018年1月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

On generalization bounds for deep networks based on loss surface implicit regularization

Arxiv

0+阅读 · 2022年1月12日

Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction

Arxiv

7+阅读 · 2021年12月20日

The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks

Arxiv

4+阅读 · 2021年7月5日

ResMLP: Feedforward networks for image classification with data-efficient training

Arxiv

12+阅读 · 2021年5月7日

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training

Arxiv

14+阅读 · 2021年2月16日

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Arxiv

13+阅读 · 2020年6月24日

Self-training with Noisy Student improves ImageNet classification

Arxiv

15+阅读 · 2019年11月11日

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Arxiv

3+阅读 · 2019年9月25日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Pyramidal RoR for Image Classification

Arxiv

3+阅读 · 2017年10月1日

VIP会员

文章信息

相关主题

相关VIP内容

图像修复研究进展综述

专知会员服务

64+阅读 · 2021年3月9日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

专知会员服务

55+阅读 · 2020年5月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【论文】自训练噪声student模型提高ImageNet分类准确率（Self-training with Noisy Student improves ImageNet classification），谷歌研究科学家Quoc V. Le等

【论文】自训练噪声student模型提高ImageNet分类准确率（Self-training with Noisy Student improves ImageNet classification），谷歌研究科学家Quoc V. Le等

专知会员服务

24+阅读 · 2019年11月20日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

《复杂工程系统模型驱动设计决策支持系统：早期设计阶段挑战》最新138页

《日本陆上自卫队2040年作战方式与未来作战研究》最新23页slides

人工智能作为战争武器

《后勤保障》最新23页

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【推荐】RNN最新研究进展综述

【推荐】RNN最新研究进展综述

机器学习研究会

26+阅读 · 2018年1月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

On generalization bounds for deep networks based on loss surface implicit regularization

Arxiv

0+阅读 · 2022年1月12日

Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction

Arxiv

7+阅读 · 2021年12月20日

The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks

Arxiv

4+阅读 · 2021年7月5日

ResMLP: Feedforward networks for image classification with data-efficient training

Arxiv

12+阅读 · 2021年5月7日

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training

Arxiv

14+阅读 · 2021年2月16日

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Arxiv

13+阅读 · 2020年6月24日

Self-training with Noisy Student improves ImageNet classification

Arxiv

15+阅读 · 2019年11月11日

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Arxiv

3+阅读 · 2019年9月25日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Pyramidal RoR for Image Classification

Arxiv

3+阅读 · 2017年10月1日

微信扫码咨询专知VIP会员