In this work we describe an Adaptive Regularization using Cubics (ARC) method for large-scale nonconvex unconstrained optimization using Limited-memory Quasi-Newton (LQN) matrices. ARC methods are a relatively new family of optimization strategies that use a cubic-regularization (CR) term in place of trust regions and line searches. LQN methods offer a large-scale alternative to explicit second-order information by taking the same inputs as popular first-order methods such as stochastic gradient descent (SGD). Solving the CR subproblem exactly requires Newton's method, yet by exploiting the internal structure of LQN matrices we are able to find exact solutions to the CR subproblem in a matrix-free manner, providing large speedups and scaling to modern size requirements. Additionally, we expand upon previous ARC work and explicitly incorporate first-order updates into our algorithm. We provide experimental results for the SR1 update, which show substantial speedups and competitive performance compared to Adam and other second-order optimizers on deep neural networks (DNNs). We find that our new approach, ARCLQN, is competitive with modern optimizers while requiring minimal tuning, a common pain point for second-order methods.
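For reference, a minimal sketch of the CR subproblem in its standard ARC form, with notation assumed rather than taken from this work: at iterate $x_k$ with gradient $g_k$, quasi-Newton Hessian approximation $B_k$ (here an LQN matrix), and adaptive regularization weight $\sigma_k > 0$, the step $s_k$ approximately minimizes
$$
m_k(s) \;=\; f(x_k) \;+\; g_k^{\top} s \;+\; \tfrac{1}{2}\, s^{\top} B_k s \;+\; \tfrac{\sigma_k}{3}\, \|s\|^3 .
$$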