When training neural networks with low-precision computation, rounding errors often cause stagnation or are detrimental to the convergence of the optimizers. In this paper we study the influence of rounding errors on the convergence of the gradient descent method for problems satisfying the Polyak-Łojasiewicz inequality. Within this context, we show that biased stochastic rounding errors may, in contrast, be beneficial: choosing a proper rounding strategy eliminates the vanishing gradient problem and forces the rounding bias into a descent direction. Furthermore, we obtain a bound on the convergence rate that is tighter than the one achieved by unbiased stochastic rounding. The theoretical analysis is validated by comparing the performance of various rounding strategies when optimizing several examples with low-precision fixed-point number formats.
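For context, the two notions the abstract relies on can be stated as follows; the notation here is illustrative and not taken from the paper. The Polyak-Łojasiewicz inequality requires that, for some $\mu > 0$,
\[
  \tfrac{1}{2}\,\lVert \nabla f(x) \rVert_2^2 \;\ge\; \mu\,\bigl(f(x) - f^\star\bigr)
  \qquad \text{for all } x,
\]
where $f^\star$ is the minimum of the objective $f$. Standard (unbiased) stochastic rounding to a fixed-point grid with spacing $\varepsilon$ rounds a value $x$ to one of its two neighbouring grid points with probabilities proportional to the distances,
\[
  \operatorname{SR}(x) \;=\;
  \begin{cases}
    \lfloor x \rfloor_\varepsilon, & \text{with probability } 1 - p(x),\\[2pt]
    \lfloor x \rfloor_\varepsilon + \varepsilon, & \text{with probability } p(x) = \dfrac{x - \lfloor x \rfloor_\varepsilon}{\varepsilon},
  \end{cases}
  \qquad \mathbb{E}\bigl[\operatorname{SR}(x)\bigr] = x,
\]
where $\lfloor x \rfloor_\varepsilon$ denotes the largest grid point not exceeding $x$. A biased stochastic rounding strategy instead skews these probabilities so that the expected rounding error is nonzero, which is the property exploited in the analysis.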