Deep learning models dominate almost all artificial intelligence tasks, such as vision, text, and speech processing. Stochastic Gradient Descent (SGD) is the main tool for training such models, and the computations are usually performed in single-precision floating-point format. The convergence of single-precision SGD normally matches the theoretical results derived for real numbers, since single-precision arithmetic introduces negligible rounding error. However, the numerical error grows when the computations are performed in low-precision number formats. This provides a compelling reason to study SGD convergence adapted to low-precision computation. We present both deterministic and stochastic analyses of the SGD algorithm, obtaining bounds that show the effect of the number format. Such bounds can provide guidelines on how SGD convergence is affected when constraints make high-precision computation infeasible.
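The rounding error referred to above can be made concrete with a small experiment. The sketch below is a minimal illustration, not the analysis from the paper: it runs SGD on an assumed least-squares objective while rounding every gradient and weight update to float16, so the accumulated low-precision error can be observed directly. The `quantize` helper and the problem setup are assumptions chosen purely for illustration.

```python
import numpy as np

# Illustrative sketch only (not the paper's algorithm): SGD on a simple
# quadratic objective f(w) = 0.5 * ||A w - b||^2, with every intermediate
# quantity rounded to a low-precision format (here float16) to mimic the
# rounding error discussed in the abstract.

def quantize(x, dtype=np.float16):
    """Round-to-nearest into a low-precision format, then back to float64."""
    return x.astype(dtype).astype(np.float64)

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))
b = rng.standard_normal(100)

w = np.zeros(10)
lr = 1e-3
for step in range(1000):
    i = rng.integers(0, 100)            # sample one data point
    grad = (A[i] @ w - b[i]) * A[i]     # stochastic gradient of 0.5*(a_i^T w - b_i)^2
    grad = quantize(grad)               # low-precision gradient
    w = quantize(w - lr * grad)         # low-precision weight update

print("final loss:", 0.5 * np.mean((A @ w - b) ** 2))
```

Rerunning the loop with `dtype=np.float32` (or removing the `quantize` calls) gives a point of comparison for how much the coarser format slows the decrease of the loss.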