Training neural networks with binary weights and activations is challenging due to the lack of gradients and the difficulty of optimizing over discrete weights. Many successful experimental results have been achieved with empirical straight-through (ST) approaches, which propose a variety of ad hoc rules for propagating gradients through non-differentiable activations and for updating discrete weights. At the same time, ST methods can be rigorously derived as estimators in the stochastic binary network (SBN) model with Bernoulli weights. We advance these derivations into a more complete and systematic study. We analyze properties and estimation accuracy, obtain different forms of correct ST estimators for activations and weights, explain existing empirical approaches and their shortcomings, and explain how latent weights arise from the mirror descent method when optimizing over probabilities. This allows us to reintroduce ST methods, long known empirically, as sound approximations, to apply them with clarity, and to develop further improvements.
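As a concrete illustration of the empirical ST rule discussed above, the sketch below shows the common clipped-identity variant: the forward pass uses a hard sign activation, while the backward pass propagates the gradient as if the activation were the identity on |x| <= 1. This is a minimal NumPy sketch under those assumptions, not the paper's derived estimator; the function names are illustrative.

```python
import numpy as np

def binary_forward(x):
    # Forward pass: deterministic sign activation (non-differentiable)
    return np.where(x >= 0.0, 1.0, -1.0)

def st_backward(x, grad_out):
    # Straight-through backward pass: treat the activation as the identity,
    # with the gradient clipped to zero outside |x| <= 1 ("hard tanh" variant)
    return grad_out * (np.abs(x) <= 1.0).astype(x.dtype)

x = np.array([-1.5, -0.3, 0.2, 2.0])
y = binary_forward(x)                # -> [-1., -1., 1., 1.]
g = st_backward(x, np.ones_like(x))  # -> [0., 1., 1., 0.]
```

In a full training loop this backward rule replaces the true (zero almost everywhere) derivative of the sign function, which is what makes gradient-based optimization of binary networks possible in practice.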