February 1, 2022

Surrogate Gradients Design

Luca Herranz-Celotti, Jean Rouat

Surrogate gradient (SG) training makes it possible to quickly transfer the gains made in deep learning to neuromorphic computing and neuromorphic processors, with the consequent reduction in energy consumption. Evidence suggests that training can be robust to the choice of SG shape, but only after an extensive search over hyper-parameters. Random or grid search becomes exponentially more expensive as more hyper-parameters are considered, and every point in the search can itself be highly time- and energy-consuming for large networks and large datasets. In this article we first show that more complex tasks and networks are more sensitive to the choice of SG. Second, we show that low dampening, high sharpness and low tail fatness are preferred. Third, we observe that Glorot Uniform initialization is generally preferred by most SG choices, with variability in the results. Finally, we provide a theoretical solution that reduces the need for extensive grid search when looking for SG shapes and initializations that result in improved accuracy.
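To make the terms in the abstract concrete, here is a minimal sketch of a surrogate gradient and of how fast a grid search grows. It is an illustration under assumptions, not the authors' definition: the function names (`spike`, `surrogate_grad`, `grid_size`) are hypothetical, the SG shape is taken to be the derivative of a fast sigmoid, `dampening` is assumed to scale the peak height of the SG, and `sharpness` is assumed to scale how quickly it decays away from the threshold. Tail fatness is not parameterised here.

```python
import numpy as np


def spike(v):
    """Forward pass: Heaviside step on the threshold-centred voltage v."""
    return (np.asarray(v, dtype=float) > 0.0).astype(float)


def surrogate_grad(v, dampening=1.0, sharpness=1.0):
    """Backward-pass stand-in for the step function's derivative.

    Assumed shape (not taken from the paper): derivative of a fast sigmoid,
    with `dampening` as the peak value at v = 0 and `sharpness` controlling
    how narrow the bump around the threshold is.
    """
    v = np.asarray(v, dtype=float)
    return dampening / (1.0 + sharpness * np.abs(v)) ** 2


def grid_size(values_per_hyperparameter, n_hyperparameters):
    """Number of grid points when every hyper-parameter takes the same
    number of candidate values: the count grows exponentially."""
    return values_per_hyperparameter ** n_hyperparameters


if __name__ == "__main__":
    v = np.linspace(-2.0, 2.0, 5)
    print("spikes:        ", spike(v))
    print("SG, sharp=1:   ", surrogate_grad(v, dampening=0.3, sharpness=1.0))
    print("SG, sharp=10:  ", surrogate_grad(v, dampening=0.3, sharpness=10.0))
    print("grid points for 5 values x 4 hyper-parameters:", grid_size(5, 4))
```

The true derivative of the step function is zero almost everywhere, so the backward pass substitutes a smooth bump of this kind. With 5 candidate values for each of 4 hyper-parameters the grid already has 625 points, each of which is a full training run.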
